openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform

The requested subsequence of BLOB is longer than 10Mb, thus it cannot be stored as a string #1057

Open hussien opened 2 years ago

hussien commented 2 years ago

I have the Microsoft Academic Graph loaded in a local Virtuoso endpoint. The graph has 10B+ triples. When I use isql to execute a SPARQL query like the following

/isql dba dba exec="set blobs on; sparql define output:format '"TSV"' SELECT DISTINCT ?s ?p ?o from <> WHERE { ?s a <http://mag/paper>. ?s ?p ?o. } " > output.tsv

and export the results into a TSV file, I get the error "The requested subsequence of BLOB is longer than 10Mb, thus it cannot be stored as a string".

When I add a LIMIT of 100K, it works.

My query returns 120M triples. How can I export them all with a single query?

I have tried running a batch of queries with LIMIT and OFFSET, but that takes too long.
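For readers following along, the batching approach mentioned above can be sketched roughly as below. This is only an illustration of how the paged query strings might be generated, not the poster's actual script. The helper name `paged_queries` and the constants are hypothetical, and the graph clause is left out because the IRI is elided in the original command.

```python
from typing import Iterator


def paged_queries(base_query: str, total_rows: int, page_size: int) -> Iterator[str]:
    """Yield base_query once per page, with LIMIT/OFFSET appended.

    Hypothetical helper: it only builds query strings and does not
    contact a Virtuoso server or invoke isql.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total_rows, page_size):
        yield f"{base_query} LIMIT {page_size} OFFSET {offset}"


# Query shape taken from the thread; the FROM clause is omitted because
# the graph IRI is not given in the original post.
BASE_QUERY = "SELECT DISTINCT ?s ?p ?o WHERE { ?s a <http://mag/paper> . ?s ?p ?o . }"

# Figures from the thread: about 120M result rows, 100K rows per page.
TOTAL_ROWS = 120_000_000
PAGE_SIZE = 100_000

if __name__ == "__main__":
    queries = list(paged_queries(BASE_QUERY, TOTAL_ROWS, PAGE_SIZE))
    print(len(queries), "paged queries")
    print(queries[0])
    print(queries[-1])
```

At 100K rows per page, 120M rows means 1,200 separate queries, which is consistent with the batch run being slow. Note also that SPARQL does not guarantee a stable row order without ORDER BY, so pages produced this way may overlap or skip rows.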

HughWilliams commented 2 years ago

Can you provide:

  1. The output of running the following query to check the max length of the object values in the RDF_OBJ table:

    select max (length(RO_LONG)) from RDF_OBJ;

  2. Run __dbf_set ('callstack_on_exception', 1); and then re-run your query. This should provide more details on the source of the error.

Also, does the error occur when other output formats are chosen?

hussien commented 2 years ago

Hi @HughWilliams, thanks for your reply. Here are the outputs:

  1. max (length(RO_LONG)) = 165124

  2. After enabling the call stack, the detailed error is: Error 22023: [Virtuoso Driver][Virtuoso Server]HT057: The STRING session in string_output_string is longer than 10Mb. Either use substring to access it in parts or place less data in it. in string_output_string:(BIF), DB.DBA.RDF_FORMAT_RESULT_SET_AS_TSV_FIN.

Also, the error occurs with every output format I tried (TSV, JSON, NT).

HughWilliams commented 2 years ago

Thanks for the additional information. We are looking into the cause of the error ...