openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com

`BIND` or `FILTER` -> query error `SR012: Function aref needs a string or an array as argument 1` #533

Open JervenBolleman opened 8 years ago

JervenBolleman commented 8 years ago

The following query works.

PREFIX core:<http://purl.uniprot.org/core/> 
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> 
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#> 
PREFIX faldo:<http://biohackathon.org/resource/faldo#> 

select ?protein ?dr ?xref {
  FILTER(<http://purl.uniprot.org/uniprot/P05067> = ?protein)
      ?protein core:annotation ?annotation .
      ?annotation a core:Natural_Variant_Annotation .
      ?annotation rdfs:comment ?text .
      ?annotation core:substitution ?substitution .
      ?annotation core:range [faldo:begin [faldo:position ?location]] .
      ?statement rdf:object ?annotation .
      ?statement core:attribution ?ref .
      ?ref core:source ?citation .
      ?protein rdfs:seeAlso ?dr .
      ?dr core:database <http://purl.uniprot.org/database/Ensembl> .
      SERVICE <http://identifiers.org/services/sparql>{
            ?dr owl:sameAs ?xref . 
      }
  }

This one

PREFIX core:<http://purl.uniprot.org/core/> 
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> 
PREFIX skos:<http://www.w3.org/2004/02/skos/core#> 
PREFIX faldo:<http://biohackathon.org/resource/faldo#> 

select ?protein ?dr ?xref {
  BIND(<http://purl.uniprot.org/uniprot/P05067> AS ?protein)
      ?protein core:annotation ?annotation .
      ?annotation a core:Natural_Variant_Annotation .
      ?annotation rdfs:comment ?text .
      ?annotation core:substitution ?substitution .
      ?annotation core:range [faldo:begin [faldo:position ?location]] .
      ?statement rdf:object ?annotation .
      ?statement core:attribution ?ref .
      ?ref core:source ?citation .
      ?protein rdfs:seeAlso ?dr .
      ?dr core:database <http://purl.uniprot.org/database/Ensembl> .
      SERVICE <http://identifiers.org/services/sparql>{
            ?dr owl:sameAs ?xref . 
      }
  }

Results in

SR012: Function aref needs a string or an array as argument 1, not an arg of type DB_NULL (204)

The shorter query form

PREFIX core:<http://purl.uniprot.org/core/> 
PREFIX keywords:<http://purl.uniprot.org/keywords/> 
PREFIX uniprotkb:<http://purl.uniprot.org/uniprot/> 
PREFIX taxon:<http://purl.uniprot.org/taxonomy/> 
PREFIX ec:<http://purl.uniprot.org/enzyme/> 
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#> 
PREFIX skos:<http://www.w3.org/2004/02/skos/core#> 
PREFIX owl:<http://www.w3.org/2002/07/owl#> 
PREFIX bibo:<http://purl.org/ontology/bibo/> 
PREFIX dc:<http://purl.org/dc/terms/> 
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#> 
PREFIX faldo:<http://biohackathon.org/resource/faldo#> 

select ?protein ?dr ?xref {
  BIND(<http://purl.uniprot.org/uniprot/P05067> AS ?protein)
      ?protein rdfs:seeAlso ?dr .
      ?dr core:database <http://purl.uniprot.org/database/Ensembl> .
      SERVICE <http://identifiers.org/services/sparql>{
            ?dr owl:sameAs ?xref . 
      }
  }

does not generate an error either

HughWilliams commented 8 years ago

@JervenBolleman: You seem to have the first and second queries the wrong way around: the first one gives the error you report when run against our LOD server, and the second one executes successfully but returns no results ... So we will look into this ...

Alain-Gateau commented 4 years ago

I ran into the same bug in other circumstances. This is a query run from neXtProt's endpoint wrapper (https://snorql.nextprot.org):

select distinct ?iso ?spos ?epos ?annot_type ?txt where {
  values ?ensp {"ENSP00000446475" "ENSP00000265436"}
  values ?poi {400} # position of interest
  #values ?ensp {"ENSP00000383211"}
  bind (IRI(CONCAT("http://rdf.ebi.ac.uk/resource/ensembl.protein/",?ensp)) as ?ENSP_IRI) 
  SERVICE <http://sparql.uniprot.org/sparql> {
     SELECT * WHERE 
    {
    ?enst up:translatedTo ?ENSP_IRI .
    ?enst rdfs:seeAlso  ?upiso .
    }
   }
  BIND(IRI(replace(str(?upiso),"http://purl.uniprot.org/isoforms/","http://nextprot.org/rdf/isoform/NX_")) AS ?iso) .
  ?entry :reference ?ref .
  ?entry :isoform ?iso .
  ?iso :positionalAnnotation ?statement .
  optional {?statement rdfs:comment ?txt .}
  #?statement rdfs:comment ?txt .
  ?statement a ?annot_type .
  ?statement :start ?spos; :end ?epos .
  filter((?spos <= ?poi) && (?epos >= ?poi)) # select annotations encompassing the position of interest
  } order by ?iso ?spos

It triggers the SR012 function aref... error. But if I remove the optional {...} for the statement, it works fine. Also, if I have only one value in the ?ensp variable, it works fine even with the optional.

tfrancart commented 1 year ago

Seems this error is triggered on a combination of OPTIONAL + SERVICE
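
To check whether other failing queries fit this pattern, a crude text scan can flag queries that use both keywords. The sketch below is only an illustration in Python: the function name `combines_optional_and_service` is made up here, and the regular expressions are a rough approximation of SPARQL lexing (triple-quoted strings are not handled), not a parser.

```python
import re

# Parts of a query in which keywords must not be counted.
_IRI = re.compile(r"<[^<>\s]*>")
_STRING = re.compile(r'"(?:[^"\\\n]|\\.)*"|\'(?:[^\'\\\n]|\\.)*\'')
_COMMENT = re.compile(r"#[^\n]*")


def _keyword(word: str) -> "re.Pattern[str]":
    # Match the bare keyword, not ?optional, ex:optional or optionalFoo.
    return re.compile(rf"(?<![?$:\w]){word}(?![\w:])", re.IGNORECASE)


_OPTIONAL = _keyword("OPTIONAL")
_SERVICE = _keyword("SERVICE")


def combines_optional_and_service(query: str) -> bool:
    """Return True if the query text uses both OPTIONAL and SERVICE."""
    # Order matters: drop IRIs and strings first, so that a '#' inside
    # them is not mistaken for the start of a comment.
    text = _IRI.sub(" ", query)
    text = _STRING.sub(" ", text)
    text = _COMMENT.sub(" ", text)
    return bool(_OPTIONAL.search(text)) and bool(_SERVICE.search(text))
```

A query for which this returns True is not guaranteed to hit SR012; it only matches the shape of the queries reported in this thread.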

HughWilliams commented 1 year ago

If you are referring to the query above running against https://snorql.nextprot.org/, which is running a Virtuoso open source build from 2021, I would suggest upgrading it to the just-announced Virtuoso 7.2.10 release, as the query runs against that build ...

tfrancart commented 1 year ago

I was not referring to this precise query, but to another query of mine that triggers the same error using a combination of OPTIONAL + SERVICE. As in the case mentioned above, removing the OPTIONAL makes the error go away.

Unfortunately, I don't know which Virtuoso version the query runs against.

HughWilliams commented 1 year ago

What is your query, and is the Virtuoso SPARQL endpoint it is running against publicly accessible?

Note the Virtuoso version details can be obtained from the SPARQL endpoint as detailed here.
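
For reference, one way such a version lookup is commonly written is sketched below in Python. It assumes the endpoint permits Virtuoso's `bif:sys_stat` built-in with the `st_dbms_name` and `st_dbms_ver` keys, which may not hold on every deployment; the helper name `version_request_url` and the localhost URL are illustrative only. The function only builds the GET URL and does not contact any server.

```python
from urllib.parse import urlencode

# Assumed version query: relies on Virtuoso's bif:sys_stat built-in
# being callable from SPARQL on the target endpoint.
VERSION_QUERY = """
SELECT ( bif:sys_stat('st_dbms_name') AS ?name )
       ( bif:sys_stat('st_dbms_ver')  AS ?version )
WHERE { ?s ?p ?o }
LIMIT 1
"""


def version_request_url(endpoint: str) -> str:
    """Build a SPARQL-protocol GET URL that asks the endpoint for its version."""
    # 'query' is the standard SPARQL protocol parameter name.
    return endpoint + "?" + urlencode({"query": VERSION_QUERY.strip()})


if __name__ == "__main__":
    # Hypothetical local endpoint; substitute the endpoint under test.
    print(version_request_url("http://localhost:8890/sparql"))
```

The printed URL can be opened in a browser or fetched with any HTTP client.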

tfrancart commented 1 year ago

Generates the error:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ProvidedCHO_1 ?ProvidedCHO_1_label ?ScopeNote_4 WHERE {
  {
    SELECT * WHERE {
      ?ProvidedCHO_1 rdf:type <http://www.europeana.eu/schemas/edm/ProvidedCHO>.
      OPTIONAL {
        ?ProvidedCHO_1 (^<http://www.openarchives.org/ore/terms/proxyFor>/<http://purl.org/dc/elements/1.1/title>) ?ProvidedCHO_1_label.
        FILTER((LANG(?ProvidedCHO_1_label)) = "en")
      }
      ?ProvidedCHO_1 (^<http://www.openarchives.org/ore/terms/proxyFor>/<http://purl.org/dc/elements/1.1/type>) ?Type_2.
    }
  }
  SERVICE <https://vocab.getty.edu/sparql> {
    ?Type_2 (<http://www.w3.org/2004/02/skos/core#scopeNote>/rdf:value) ?ScopeNote_4.
    FILTER((LANG(?ScopeNote_4)) = "en")
  }
}
LIMIT 1000

Without the OPTIONAL, the query does not generate the error (it does not return results either, but that's a different issue):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ProvidedCHO_1 ?ProvidedCHO_1_label ?ScopeNote_4 WHERE {
  {
    SELECT * WHERE {
      ?ProvidedCHO_1 rdf:type <http://www.europeana.eu/schemas/edm/ProvidedCHO>.
      ?ProvidedCHO_1 (^<http://www.openarchives.org/ore/terms/proxyFor>/<http://purl.org/dc/elements/1.1/title>) ?ProvidedCHO_1_label.
      FILTER((LANG(?ProvidedCHO_1_label)) = "en")
      ?ProvidedCHO_1 (^<http://www.openarchives.org/ore/terms/proxyFor>/<http://purl.org/dc/elements/1.1/type>) ?Type_2.
    }
  }
  SERVICE <https://vocab.getty.edu/sparql> {
    ?Type_2 (<http://www.w3.org/2004/02/skos/core#scopeNote>/rdf:value) ?ScopeNote_4.
    FILTER((LANG(?ScopeNote_4)) = "en")
  }
}
LIMIT 1000

The endpoint is https://sage-ails.ails.ece.ntua.gr/api/content/semanticsearch-digital-repository-of-ireland/sparql (it may not stay there for long), and the version query returns this:

[image: screenshot of the version query result]

TallTed commented 1 year ago

I would strongly advise that the sage-ails administrators update their Virtuoso components to a current build, rather than sticking with their current two-year-old components, which are 778 commits behind current.

If this issue persists with current builds, we will certainly dig further into it.

tfrancart commented 1 year ago

The query below, executed against https://sage-ails.ails.ece.ntua.gr/api/content/semanticsearch-digital-repository-of-ireland/sparql, which runs Virtuoso 7.20.3237 (see screenshot below), still generates the error Virtuoso 22023 Error SR012: Function aref needs a string or an array as argument 1, not an arg of type DB_NULL (204):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ProvidedCHO_1 ?ProvidedCHO_1_label ?ScopeNote_4 WHERE {
  {
    SELECT * WHERE {
      ?ProvidedCHO_1 rdf:type <http://www.europeana.eu/schemas/edm/ProvidedCHO>.
      OPTIONAL {
        ?ProvidedCHO_1 (^<http://www.openarchives.org/ore/terms/proxyFor>/<http://purl.org/dc/elements/1.1/title>) ?ProvidedCHO_1_label.
        FILTER((LANG(?ProvidedCHO_1_label)) = "en")
      }
      ?ProvidedCHO_1 (^<http://www.openarchives.org/ore/terms/proxyFor>/<http://purl.org/dc/elements/1.1/type>) ?Type_2.
    }
  }
  SERVICE <https://vocab.getty.edu/sparql> {
    ?Type_2 (<http://www.w3.org/2004/02/skos/core#scopeNote>/rdf:value) ?ScopeNote_4.
    FILTER((LANG(?ScopeNote_4)) = "en")
  }
}
LIMIT 1000

Virtuoso version:

[image: screenshot of the Virtuoso version]

pkleef commented 1 year ago

Could you please provide us with the output for all of the commands below:

virtuoso-t -?

Then connect to your virtuoso instance using the isql tool and run the following 2 commands:

SQL> status();

and

SQL> select count(*) from rdf_quad where o is null;

Finally, run the following command:

SQL> __dbf_set ('callstack_on_exception', 2);

and then run your query again in your browser.

This should give you a more comprehensive stack trace in your browser, which our developers would like to see.
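
The isql steps above can be sketched as a small Python helper that prints the corresponding invocations. Assumptions: the client binary is called `isql` (some packages install it as `isql-v`), the server listens on port 1111, and `dba`/`dba` are placeholder credentials to replace. The helper names are hypothetical, and nothing is executed; the code only builds the command lines.

```python
import shlex

# Statements requested above, in the order given.
DIAGNOSTIC_STATEMENTS = [
    "status();",
    "select count(*) from rdf_quad where o is null;",
    "__dbf_set ('callstack_on_exception', 2);",
]


def isql_argv(statement, port="1111", user="dba", password="dba", binary="isql"):
    """Return the argument list for running one statement through isql."""
    # Placeholder defaults: adjust port, credentials and binary name.
    return [binary, port, user, password, f"exec={statement}"]


def shell_lines(**kwargs):
    """Return shell-quoted command lines, one per diagnostic statement."""
    return [shlex.join(isql_argv(s, **kwargs)) for s in DIAGNOSTIC_STATEMENTS]


if __name__ == "__main__":
    for line in shell_lines():
        print(line)
```

After running the printed commands, re-run the failing query in the browser as described above.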

tfrancart commented 1 year ago

/opt/virtuoso-opensource/database# virtuoso-t -?
Virtuoso Open Source Edition (Column Store) (multi threaded)
Version 7.2.10.3237-pthreads as of Jun  7 2023 (f3d88f16b)
Compiled for Linux (x86_64-ubuntu_bionic-linux-gnu)
Copyright (C) 1998-2023 OpenLink Software

Usage:
  virtuoso-t [-fcnCbDARwMKrBd] [+foreground] [+configfile arg] [+no-checkpoint]
             [+checkpoint-only] [+backup-dump] [+crash-dump]
             [+crash-dump-data-ini arg] [+restore-crash-dump] [+wait]
             [+mode arg] [+dumpkeys arg] [+restore-backup arg]
             [+backup-dirs arg] [+debug] [+pwdold arg] [+pwddba arg]
             [+pwddav arg]
  +foreground            run in the foreground
  +configfile            use alternate configuration file
  +no-checkpoint         do not checkpoint on startup
  +checkpoint-only       exit as soon as checkpoint on startup is complete
  +backup-dump           dump database into the transaction log, then exit
  +crash-dump            dump inconsistent database into the transaction log, then exit
  +crash-dump-data-ini   specify the DB ini to use for reading the data to dump
  +restore-crash-dump    restore from a crash-dump
  +wait                  wait for background initialization to complete
  +mode                  specify mode options for server startup (onbalr)
  +dumpkeys              specify key id(s) to dump on crash dump (default : all)
  +restore-backup        restore from online backup
  +backup-dirs           default backup directories
  +debug                 Show additional debugging info
  +pwdold                Old DBA password
  +pwddba                New DBA password
  +pwddav                New DAV password

SQL> status();
REPORT
VARCHAR
_______________________________________________________________________________

OpenLink Virtuoso  Server
Version 07.20.3237-pthreads for Linux as of Jun  7 2023 (f3d88f16b)
Started on: 2023-06-21 11:15 GMT+0

Database Status:
  File size 692060160, 84480 pages, 43308 free.
  680000 buffers, 29087 used, 41 dirty 0 wired down, repl age 0 0 w. io 0 w/crsr.
  Disk Usage: 29369 reads avg 0 msec, 0% r 0% w last  35477 s, 8175 writes flush        128 MB/s,
    192 read ahead, batch = 143.  Autocompact 45 in 43 out, 4% saved col ac: 4026 in 3% saved.
Gate:  574 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap. 
Log = virtuoso.trx, 4042 bytes
40430 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 3 connects, max 2 concurrent
RPC: 83 calls, 1 pending, 1 max until now, 0 queued, 0 burst reads (0%), 0 second 0M large, 23M max
Checkpoint Remap 665 pages, 0 mapped back. 1 s atomic time.
    DB master 84480 total 43308 free 665 remap 3 mapped back
   temp  1024 total 1019 free

Lock Status: 0 deadlocks of which 0 2r1w, 0 waits,
   Currently 1 threads running 0 threads waiting 0 threads in vdb.
Pending:

Client 1111:3:  Account: dba, 200 bytes in, 286 bytes out, 1 stmts.
PID: 44, OS: unix, Application: unknown, IP#: 127.0.0.1
Transaction status: PENDING, 1 threads.
Locks: 

Client 1111:1:  Account: dba, 76336 bytes in, 2571 bytes out, 0 stmts.
PID: 0, OS: Linux, Application: JDBC, IP#: 147.102.11.56
Transaction status: PENDING, 0 threads.
Locks: 

Running Statements:
 Time (msec) Text
          11 status()

Hash indexes

42 Rows. -- 11 msec.

SQL> select count(*) from rdf_quad where o is null;
count
INTEGER
_______________________________________________________________________________

0

1 Rows. -- 0 msec.

Response of SPARQL endpoint for given query

Virtuoso 22023 Error SR012: Function aref needs a string or an array as argument 1, not an arg of type DB_NULL (204)
in
aref:(BIF),
        __01 => NULL,
        __02 => 0,
DB.DBA.SPARQL_SINV_IMP,
  ws_endpoint => 'https://vocab.getty.edu/sparql',
   ws_params => (ARRAY_OF_POINTER value, tag 193),
  qtext_template => ' SELECT ?ScopeNote_4
 WHERE {  ?!000001 ( <http://www.w3.org/2004/02/skos/core#scopeNote> / <http://' (truncated),
  qtext_posmap => (NVARCHAR value, tag 225),
   param_row => NULL,
  expected_vars => (ARRAY_OF_POINTER value, tag 193)

SPARQL query:
define sql:big-data-const 0
#output-format:text/html
define sql:signal-void-variables 1
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?ProvidedCHO_1 ?ProvidedCHO_1_label ?ScopeNote_4 WHERE {
  {
    SELECT * WHERE {
      ?ProvidedCHO_1 rdf:type <http://www.europeana.eu/schemas/edm/ProvidedCHO>.
      OPTIONAL {
        ?ProvidedCHO_1 (^<http://www.openarchives.org/ore/terms/proxyFor>/<http://purl.org/dc/elements/1.1/title>) ?ProvidedCHO_1_label.
        FILTER((LANG(?ProvidedCHO_1_label)) = "en")
      }
      ?ProvidedCHO_1 (^<http://www.openarchives.org/ore/terms/proxyFor>/<http://purl.org/dc/elements/1.1/type>) ?Type_2.
    }
  }
  SERVICE <https://vocab.getty.edu/sparql> {
    ?Type_2 (<http://www.w3.org/2004/02/skos/core#scopeNote>/rdf:value) ?ScopeNote_4.
    FILTER((LANG(?ScopeNote_4)) = "en")
  }
}
LIMIT 1000

pkleef commented 1 year ago

We have some idea of what the internal issue is. However, to fully debug it, we need to load the dataset ourselves so we can set up a database in our office.

Is this dataset freely available for us to download?

tfrancart commented 1 year ago

@pkleef The dataset is available for download at https://sage-ails.ails.ece.ntua.gr/api/content/semanticsearch-digital-repository-of-ireland/distribution/ttl. Thanks.