openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com

sparql slow if more object value specified #932

Open anatta8 opened 4 years ago

anatta8 commented 4 years ago

Hi, I am using Virtuoso 7 Open Source Edition.

The query below is fast; it returns results instantly:

SELECT *
WHERE {
  GRAPH <http://localhost:8890/graph> {
    ?catAtt qq:catId        ?catId;
            qq:caDataType   ?caDataType;
            qq:showInView   ?showInview;
            qq:valFormat    ?valFormatKey;
            qq:multiple     ?multiple;
            qq:position     ?position;
            qq:link         ?link;
            qq:catAttName   ?catAttName;
            qq:setting      ?setting;
            qq:flag         ?flag;
            qq:unit         ?caUnit.
  }
}
LIMIT 20

This query is very slow: it takes about 10 seconds to return a result. The only difference is that ?catId is replaced with the literal value 1. I expected it to be faster because it is more specific.

SELECT *
WHERE {
  GRAPH <http://localhost:8890/graph> {
    ?catAtt qq:catId        1;
            qq:caDataType   ?caDataType;
            qq:showInView   ?showInview;
            qq:valFormat    ?valFormatKey;
            qq:multiple     ?multiple;
            qq:position     ?position;
            qq:link         ?link;
            qq:catAttName   ?catAttName;
            qq:setting      ?setting;
            qq:flag         ?flag;
            qq:unit         ?caUnit.
  }
}
LIMIT 20

Why is that? I am still new to SPARQL and triple stores. Do I need to create an index on the object?

I tried creating a GOPS index with the statement below. I don't understand the PARTITION clause: is using O correct, and what does the partition mean? Creating this index brought the second query down to 4 seconds, which is still too slow for a DBMS.

CREATE COLUMN INDEX RDF_QUAD_GOPS
  ON RDF_QUAD (G, O, P, S)
  PARTITION (O VARCHAR (-1, 0hexffff));
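
A toy sketch of what the key order (G, O, P, S) means for lookups, assuming a composite index behaves roughly like a list of tuples kept in sorted order. This is only an illustration in Python, not Virtuoso's actual storage engine; the sample quads and the `prefix_scan` helper are hypothetical.

```python
from bisect import bisect_left

# Hypothetical sample quads stored as (G, O, P, S) tuples; the values are made up.
G = "http://localhost:8890/graph"
quads = sorted([
    (G, 1, "qq:catId", "att:1"),
    (G, 1, "qq:catId", "att:2"),
    (G, 2, "qq:catId", "att:3"),
    (G, 5, "qq:position", "att:1"),
    (G, 7, "qq:position", "att:2"),
])


def prefix_scan(index, prefix):
    """Return the rows of a sorted tuple index whose leading columns equal `prefix`."""
    n = len(prefix)
    start = bisect_left(index, prefix)  # binary search to the first row >= prefix
    rows = []
    for row in index[start:]:
        if row[:n] != prefix:
            break  # rows are sorted, so no later row can match
        rows.append(row)
    return rows


# Fixing the leading columns (G, then O) lets the scan jump straight to the matches.
subjects = [s for (_, _, _, s) in prefix_scan(quads, (G, 1))]
print(subjects)

# Fixing only P, which is not a leading column, forces a walk over every row.
positions = [row for row in quads if row[2] == "qq:position"]
print(len(positions))
```

In this model the index only helps when the bound values form a prefix of the key, which is why the column order in an index definition matters for a given query pattern.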

By the way, does Virtuoso store triples in a regular RDBMS table, namely DB.DBA.RDF_QUAD? Does that mean Virtuoso has no native graph storage, or am I misunderstanding something?

Thanks

HughWilliams commented 4 years ago

See the response to this issue on the OpenLink Community Forum ... Please post all further questions there ...