openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com

The requested subsequence of BLOB is longer than 10Mb, thus it cannot be stored as a string #1057

Open hussien opened 2 years ago

hussien commented 2 years ago

I have the Microsoft Academic Graph loaded in a local Virtuoso endpoint. The graph has 10B+ triples. When I use isql to execute a SPARQL query like this and export the results into a TSV file:

/isql 127.0.0.1:1111 dba dba exec="set blobs on; sparql define output:format '"TSV"' SELECT DISTINCT ?s ?p ?o from <http://mag.org> WHERE { ?s a <http://mag/paper>. ?s ?p ?o. } " > output.tsv

I get the error "The requested subsequence of BLOB is longer than 10Mb, thus it cannot be stored as a string".

When I add a LIMIT of 100K, it works well.

My query returns 120M triples. How can I export them in a single query?

I have tried running a batch of queries with LIMIT and OFFSET, but it takes too long.
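
For reference, the batching approach can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the exact script that was run: the helper name paged_queries and the numbers in the demo are hypothetical, and the code only builds the query strings that would then be passed to isql one at a time.

```python
from typing import Iterator

# The query from this post, without any LIMIT/OFFSET clause.
BASE_QUERY = (
    "SELECT DISTINCT ?s ?p ?o FROM <http://mag.org> "
    "WHERE { ?s a <http://mag/paper>. ?s ?p ?o. }"
)


def paged_queries(base_query: str, total_rows: int, page_size: int) -> Iterator[str]:
    """Yield one SPARQL query per page, covering rows [0, total_rows).

    paged_queries is a hypothetical helper for illustration only.
    Note: without an ORDER BY, SPARQL does not guarantee a stable row
    order across pages, so pages may overlap or miss rows.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total_rows, page_size):
        yield f"{base_query} LIMIT {page_size} OFFSET {offset}"


if __name__ == "__main__":
    # Small total for illustration; the real result set is about 120M rows.
    for query in paged_queries(BASE_QUERY, total_rows=250_000, page_size=100_000):
        print(query)
```

At 100K rows per page, 120M rows would need 1,200 separate queries, and paging by OFFSET typically gets slower as the offset grows, which may be part of why this approach takes so long.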

HughWilliams commented 2 years ago

Can you provide:

  1. The output of running the following query to check the max length of the object values in the RDF_OBJ table:

    select max (length(RO_LONG)) from RDF_OBJ;

  2. Run __dbf_set ('callstack_on_exception', 1); and then re-run your query. This should provide more details on the source of the error.

Also, does the error occur when other output formats are chosen?

hussien commented 2 years ago

Hi @HughWilliams, thanks for your reply. Here are the outputs:

  1. max (length(RO_LONG)) = 165124

  2. After enabling the call stack, the detailed error is: Error 22023: [Virtuoso Driver][Virtuoso Server]HT057: The STRING session in string_output_string is longer than 10Mb. Either use substring to access it in parts or place less data in it. in string_output_string:(BIF), DB.DBA.RDF_FORMAT_RESULT_SET_AS_TSV_FIN.

Also, the error occurs with every output format I tried (TSV, JSON, NT).

HughWilliams commented 2 years ago

Thanks for the additional information. We are looking into the cause of the error ...