openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com

Server crashes with Segmentation fault when running SPARQL federated query #734

Open melkamar opened 6 years ago

melkamar commented 6 years ago

Latest Virtuoso server crashes with segmentation fault when running the following query containing a federated sub-query.

When I comment out the SERVICE{} block, the query runs correctly.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX eu: <http://eulersharp.sourceforge.net/2003/03swap/log-rules#>
PREFIX ru: <http://purl.org/imbi/ru-meta.owl#>
prefix ex: <http://example.org/>
prefix s: <http://schema.org/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
prefix ruian: <https://ruian.linked.opendata.cz/slovník/>

SELECT distinct ?dataObj ?locationObj ?lat ?long ?dataClassType ?__prefLab__ ?__name__
WHERE {
  BIND(<http://example.org/SourceObjectA> as ?dataClassType)

  ?dataObj a <http://example.org/SourceObjectA>;
         <http://a.property>/<http://a.property>/<http://link.to.ruian> ?locationObj
         .

  OPTIONAL {
    ?dataObj skos:prefLabel ?__prefLab__.
  }
  OPTIONAL {
    ?dataObj s:name ?__name.
  }

  #
  # Optional filter when reindexingto exclude all objects that already exist
  # Example contents of excludeDataObjects:
  #    ?dataObj != <http://example.org/linkedobject-24481611> &&
  #    ?dataObj != <http://example.org/linkedobject-72715057> &&
  #
  # This will filter out the two objects listed.
  # - Note that each line/expression MUST end with the && operator, including the last one,
  #   because there is a trailing True expression in the query template.
  #   The reason for that is to avoid parsing error thrown by FILTER() - there must be something in the parentheses.
  #
  FILTER(
    True
  )

  #
  # Mapping of selectProps - name of any selectProp must NOT be any of the reserved ones (dataObj, locationObj etc.)
  # Example mapping:
  #   ?dataObj ex:a/ex:b/ex:c ?selectPropA .
  #
  # There will be one line per each selectProp

  #
  # Federated query for the location sparql controller.
  #   [lat,long]LocationPathForLocationClass will contain a
  #   property path from the Location class to its coordinates.
  SERVICE <https://ruian.linked.opendata.cz/sparql> {
    ?locationObj ruian:adresníBod/s:geo/s:latitude ?lat;
                 ruian:adresníBod/s:geo/s:longitude ?long
    .

  }
}

The error message is not very helpful:

20:08:24 Server online at 1111 (pid 5)
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x8d119a]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x8d11f8]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x6b6388]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x6b78b7]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x6b6cf4]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x66a468]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x67a977]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x67dc90]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x67e869]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x67f24b]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x680d4b]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x680e5e]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x68fe17]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x83ca48]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x83e463]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x841996]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x552ae3]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x53bfd9]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x5959da]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x59de67]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x5c5242]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x5ce0ca]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x595e21]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x59e2f7]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x5c5242]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x5ce0ca]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x595e21]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x597112]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x59b39b]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x5c5158]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x5cf6d1]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x5d0bc9]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x6d6583]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x4b1cdc]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x4b3722]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x4b39b6]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x4b3dca]
20:10:30 /opt/virtuoso-opensource/bin/virtuoso-t() [0x8dbdd3]
20:10:30 /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184) [0x7ff8d7ca4184]
20:10:30 /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7ff8d74c703d]
20:10:30 GPF: sparqld.c:329 ssg_fields_are_equal: unsupported tree type
GPF: sparqld.c:329 ssg_fields_are_equal: unsupported tree type
Segmentation fault (core dumped)

TallTed commented 6 years ago

Please provide the complete version string, i.e., the first full "paragraph" of output from the relevant command, for the instance with the segfault --

virtuoso -?
virtuoso-t -?
virtuoso-iodbc-t -?

I also note -- the remote SERVICE, https://ruian.linked.opendata.cz/sparql, appears to be down.

$ curl -LI https://ruian.linked.opendata.cz/sparql
HTTP/1.1 502 Bad Gateway
Server: nginx/1.13.12
Date: Thu, 19 Apr 2018 20:15:01 GMT
Content-Type: text/html
Content-Length: 174
Connection: keep-alive

melkamar commented 6 years ago

Yes, I tried running this query directly on that endpoint because I assumed my local setup was wrong, but it crashed that server as well.

Virtuoso Open Source Edition (Column Store) (multi threaded)
Version 7.2.5-dev.3217-pthreads as of Apr 19 2018 (f46a3f7)
Compiled for Linux (x86_64-unknown-linux-gnu)
Copyright (C) 1998-2018 OpenLink Software

I am running Virtuoso in docker, using an image built just now using instructions at http://vos.openlinksw.com/owiki/wiki/VOS/VirtuosoDocker

The query itself is correct: when I run it with Jena Fuseki, it finishes without a problem.

TallTed commented 6 years ago

(After they restart...) What happens if you run just the subquery (LIMIT added here, for first test) on the other instance?

PREFIX      s:  <http://schema.org/>
PREFIX  ruian:  <https://ruian.linked.opendata.cz/slovník/>

SELECT * 
WHERE 
  {
    ?locationObj  ruian:adresníBod/s:geo/s:latitude   ?lat 
    ;             ruian:adresníBod/s:geo/s:longitude  ?long
    .
  }
LIMIT 10

melkamar commented 6 years ago

That finishes correctly, I get 10 results.

What I forgot to mention (not sure if it's important) is that I had to uncheck the "Strict checking of void variables" checkbox to run the original query. With it checked, I kept getting the compiler error below:

Virtuoso 37000 Error SP031: SPARQL compiler: The list of return values contains '*' but the pattern does not contain variables

FabienPuig commented 6 years ago

Hello, is there any progress on this issue? I have the exact same problem, and I suspect a network configuration issue between the 2 servers. :(

TallTed commented 6 years ago

@FabienPuig - Network issues are extremely unlikely to lead to the GPF reported here. Additional details of your experience would be helpful for our analysis, such as:

TallTed commented 6 years ago

@melkamar - I'm sorry; I overlooked your last update.

I have just noticed something else I had overlooked previously. You have an illegal character in two CURIEs in your query. ruian:adresníBod should be either ruian:adresn\U00EDBod with the encoded Unicode character, or the full IRI, <https://ruian.linked.opendata.cz/slovník/adresníBod>. Thus, the full subquery should be either this --

PREFIX      s:  <http://schema.org/>
PREFIX  ruian:  <https://ruian.linked.opendata.cz/slovník/>

SELECT * 
WHERE 
  {
    ?locationObj  <https://ruian.linked.opendata.cz/slovník/adresníBod>/s:geo/s:latitude   ?lat 
    ;             <https://ruian.linked.opendata.cz/slovník/adresníBod>/s:geo/s:longitude  ?long
    .
  }
LIMIT 10

-- or this --

PREFIX      s:  <http://schema.org/>
PREFIX  ruian:  <https://ruian.linked.opendata.cz/slovník/>

SELECT * 
WHERE 
  {
    ?locationObj  ruian:adresn\U00EDBod/s:geo/s:latitude   ?lat 
    ;             ruian:adresn\U00EDBod/s:geo/s:longitude  ?long
    .
  }
LIMIT 10

I ran the first against your intended SERVICE endpoint, and got the 10 solutions without issue. The second currently results in an error; that will be brought to development for their attention.
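
For anyone who wants to generate an escaped local name programmatically, here is a minimal Python sketch. It assumes the codepoint escape syntax of the SPARQL 1.1 grammar, where `\u` is followed by exactly four hex digits and `\U` by exactly eight, so U+00ED comes out as `\u00ED`. The helper name `escape_non_ascii` is hypothetical and not part of any Virtuoso tooling, and whether a given Virtuoso build accepts the escaped form is a separate question.

```python
import unicodedata


def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a SPARQL codepoint escape.

    Assumption: escapes follow the SPARQL 1.1 forms \\uXXXX (4 hex digits)
    and \\UXXXXXXXX (8 hex digits). This is an illustrative helper only.
    """
    # Normalize first so that an accented letter is a single code point.
    normalized = unicodedata.normalize("NFC", text)
    parts = []
    for ch in normalized:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)              # plain ASCII stays as-is
        elif cp <= 0xFFFF:
            parts.append(f"\\u{cp:04X}")  # 4-digit form
        else:
            parts.append(f"\\U{cp:08X}")  # 8-digit form
    return "".join(parts)


if __name__ == "__main__":
    print(escape_non_ascii("ruian:adresn\u00edBod"))
```

ASCII input passes through unchanged, so the helper can be applied to a whole prefixed name rather than only to the accented part.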

I also checked to see how many solutions would be found without that LIMIT -- and it seems that 2,837,194 might be more than you expected to retrieve, or that various pieces in the puzzle would be prepared to handle --

PREFIX      s:  <http://schema.org/>
PREFIX  ruian:  <https://ruian.linked.opendata.cz/slovník/>

SELECT ( COUNT(*) AS ?HowManySolutions )
WHERE 
  {
    ?locationObj  <https://ruian.linked.opendata.cz/slovník/adresníBod>/s:geo/s:latitude   ?lat 
    ;             <https://ruian.linked.opendata.cz/slovník/adresníBod>/s:geo/s:longitude  ?long
    .
  }

Given that count, I wonder whether you might benefit by adjusting the remote portion of your query, perhaps to be a secondary query based on the results of the rest...
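
One way to read that suggestion is a two-step approach: run the local part of the query first, then send the collected ?locationObj IRIs straight to the remote endpoint in small batches, with no SERVICE clause involved. The Python sketch below only builds the secondary query text. The function names, the batch size, and the example.org IRIs are assumptions made for illustration, and sending the query over HTTP is left out.

```python
from typing import Iterator, List

# Full IRI form of the property, as used in the queries above.
ADDRESS_POINT = "<https://ruian.linked.opendata.cz/slovník/adresníBod>"


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_location_query(location_iris: List[str]) -> str:
    """Build a query restricted to the given location IRIs via VALUES.

    The query is meant to be sent directly to the remote endpoint,
    so it deliberately contains no SERVICE clause.
    """
    for iri in location_iris:
        # Reject characters that are not allowed inside <...> in SPARQL.
        if any(c in iri for c in '<>" {}|\\^`'):
            raise ValueError(f"not a usable IRI: {iri!r}")
    values = " ".join(f"<{iri}>" for iri in location_iris)
    return (
        "PREFIX s: <http://schema.org/>\n"
        "SELECT ?locationObj ?lat ?long\n"
        "WHERE {\n"
        f"  VALUES ?locationObj {{ {values} }}\n"
        f"  ?locationObj {ADDRESS_POINT}/s:geo/s:latitude ?lat ;\n"
        f"               {ADDRESS_POINT}/s:geo/s:longitude ?long .\n"
        "}"
    )


if __name__ == "__main__":
    # Hypothetical IRIs standing in for the results of the local query.
    found = [f"https://example.org/location/{n}" for n in range(1, 4)]
    for batch in batches(found, 2):
        print(build_location_query(batch))
        print()
```

Batching keeps each request small, which matters when the local step returns many locations.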

tlomb commented 5 years ago

Dear Virtuoso team, is there any progress on this issue?

I still see this segfault using the latest virtuoso-opensource tags/v7.2.5.1 (Version 07.20.3229-pthreads for Linux). I can reproduce it with different SPARQL queries whenever a SERVICE clause contains two property paths on the same subject.

The problem can be reduced to the following SPARQL query. It doesn't seem to matter whether the remote SERVICE used in the federated query exists or not.

PREFIX foo:<http://rdf.foo.bar/>
select * where {
  SERVICE <http://foo.bar/sparql> {
    ?fooSubj foo:bar1/foo:bar2 ?fooObj .
    ?fooSubj foo:bar3/foo:bar4 ?fooObj .
  }
}

... always gives the following segfault:

[user@[testserver] ~]$ tail -f /var/log/messages
[...]
Dec 17 16:50:50 [testserver] kernel: [46919049.920261] virtuoso-t[38313]: segfault at ffffffffffffffff ip 000000000090bd5a sp 00007fa617647cf0 error 7 in virtuoso-t[400000+c63000]
Dec 17 16:50:50 [testserver] abrt-hook-ccpp[25265]: Process 38286 (virtuoso-t) of user 21120 killed by SIGSEGV - dumping core

and the following details in the virtuoso logs:

[user@[testserver] db]$ tail -f /path/to/virtuoso[...]/virtuoso.log
[...]
16:56:13 Version 07.20.3229-pthreads for Linux as of Dec 13 2018
[...]

[user@[testserver] db]$ tail -f /path/to/virtuoso[...]/virtuoso.log

[...]
16:50:50 COMP_2 0 10.2.2.23 Internal Compile text:  sparql { define sql:big-data-const 0 
#output-format:text/html
PREFIX foo:<http://rdf.foo.bar/>
select * where {
  SERVICE <http://foo.bar/sparql> {
    ?fooObj foo:bar1/foo:bar2 ?fooSubj .
    ?fooObj foo:bar3/foo:bar4 ?fooSubj .
  }
}
}
16:50:50 /usr/bin/virtuoso-t() [0x90bc9a]
16:50:50 /usr/bin/virtuoso-t() [0x90bcf8]
16:50:50 /usr/bin/virtuoso-t() [0x6bf728]
16:50:50 /usr/bin/virtuoso-t() [0x6c0c25]
16:50:50 /usr/bin/virtuoso-t() [0x6c0064]
16:50:50 /usr/bin/virtuoso-t() [0x672248]
16:50:50 /usr/bin/virtuoso-t() [0x682b77]
16:50:50 /usr/bin/virtuoso-t() [0x686ab0]
16:50:50 /usr/bin/virtuoso-t() [0x68763a]
16:50:50 /usr/bin/virtuoso-t() [0x68803b]
16:50:50 /usr/bin/virtuoso-t() [0x689bc3]
16:50:50 /usr/bin/virtuoso-t() [0x6988a6]
16:50:50 /usr/bin/virtuoso-t() [0x85e2a8]
16:50:50 /usr/bin/virtuoso-t() [0x85f581]
16:50:50 /usr/bin/virtuoso-t() [0x862da6]
16:50:50 /usr/bin/virtuoso-t() [0x558fbb]
16:50:50 /usr/bin/virtuoso-t() [0x542939]
16:50:50 /usr/bin/virtuoso-t() [0x59aa47]
16:50:50 /usr/bin/virtuoso-t() [0x5a44e5]
16:50:50 /usr/bin/virtuoso-t() [0x5ca622]
16:50:50 /usr/bin/virtuoso-t() [0x5d37b3]
16:50:50 /usr/bin/virtuoso-t() [0x59ae1c]
16:50:50 /usr/bin/virtuoso-t() [0x5a48f5]
16:50:50 /usr/bin/virtuoso-t() [0x5ca622]
16:50:50 /usr/bin/virtuoso-t() [0x5d37b3]
16:50:50 /usr/bin/virtuoso-t() [0x59ae1c]
16:50:50 /usr/bin/virtuoso-t() [0x59c22a]
16:50:50 /usr/bin/virtuoso-t() [0x5a05bb]
16:50:50 /usr/bin/virtuoso-t() [0x5ca538]
16:50:50 /usr/bin/virtuoso-t() [0x5d4ca1]
16:50:50 /usr/bin/virtuoso-t() [0x5d6109]
16:50:50 /usr/bin/virtuoso-t() [0x6e30e3]
16:50:50 /usr/bin/virtuoso-t() [0x4b64fe]
16:50:50 /usr/bin/virtuoso-t() [0x4b7cf7]
16:50:50 /usr/bin/virtuoso-t() [0x4b7f36]
16:50:50 /usr/bin/virtuoso-t() [0x4b834a]
16:50:50 /usr/bin/virtuoso-t() [0x916f3f]
16:50:50 /lib64/libpthread.so.0(+0x7dc5) [0x7fa6ca1e6dc5]
16:50:50 /lib64/libc.so.6(clone+0x6d) [0x7fa6c980776d]
16:50:50 GPF: sparqld.c:329 ssg_fields_are_equal: unsupported tree type

(int. ref.: MET-683)

TallTed commented 5 years ago

@tlomb - This issue has been resolved in the latest Enterprise Edition (v8.2), and a fix is in the works for the Open Source Edition.

pkleef commented 5 years ago

@tlomb Please build the latest develop/7 branch and let us know if this resolves your sparql fed issues

tlomb commented 5 years ago

@pkleef Thanks for your code fix. I compiled the develop/7 git branch at commit b006ccd8d1c005c912cc4787efc2b5450c727346 (on CentOS Linux) and the problem with the dummy SPARQL query above is gone. I now correctly get "Virtuoso HTCLI Error HC001: Connection Error in HTTP Client" instead of a virtuoso-t segmentation fault. Other "real" SPARQL queries, with an existing remote SPARQL endpoint and more than one property path on the same subject, also work now (i.e. they return results instead of a segfault).

Can you let me know when/if you can make a new virtuoso-opensource tag (v7.2.5.2?) including this fix?

tlomb commented 5 years ago

@pkleef @TallTed Thanks again for fixing this issue in the develop/7 branch. Follow-up: when will this fix be part of a new virtuoso-opensource tag (e.g. v7.2.5.2 or v8.x)?

DeniseSl22 commented 1 year ago

@TallTed, could you please tell us in which version of the Open Source Edition of Virtuoso this is fixed? We're having the same issue and would like to find out how to resolve it.

TallTed commented 1 year ago

@tlomb @DeniseSl22 —

The patch discussed here was committed in Dec 2018, and the next VOS shipped was Release 7.2.6 (almost immediately replaced by 7.2.6.1). As usual, we would recommend that you update to the latest available, which is currently Release 7.2.10.

Please confirm the version (7.2.6.1 or later) to which you have updated, and whether you continue to see the issue described here with that update, so we may close the issue, or dig into what other cause there may be. If you're uncertain what version you're now running, the virtuoso-t -? command or this SPARQL query may be used to discover the version details.
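
For scripted checks, a small Python sketch along these lines can compare a release number against the 7.2.6.1 threshold mentioned above. It assumes the input is a dotted release number such as "7.2.5.1" or "7.2.10", not a build string like "07.20.3229" from the server banner. The names `parse_release` and `includes_fix` are hypothetical.

```python
from typing import Tuple

# Threshold taken from the comment above: 7.2.6.1 or later.
FIXED_IN = (7, 2, 6, 1)


def parse_release(release: str) -> Tuple[int, ...]:
    """Turn a dotted release number into a 4-part tuple, padding with zeros.

    Assumption: `release` looks like "7.2.10" or "7.2.6.1". Build strings
    such as "07.20.3229" use a different scheme and are not handled here.
    """
    parts = [int(p) for p in release.strip().lstrip("v").split(".")]
    if len(parts) > 4:
        raise ValueError(f"unexpected release format: {release!r}")
    return tuple(parts + [0] * (4 - len(parts)))


def includes_fix(release: str) -> bool:
    """Return True if the release is at or after the 7.2.6.1 threshold."""
    return parse_release(release) >= FIXED_IN


if __name__ == "__main__":
    for candidate in ("7.2.5.1", "7.2.6.1", "7.2.10"):
        print(candidate, includes_fix(candidate))
```

Comparing tuples of integers avoids the usual string-comparison trap where "7.2.10" would sort before "7.2.6".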