Open manonthegithub opened 2 years ago
Likely to be linked to: https://github.com/openlink/virtuoso-opensource/issues/571
Very likely not a gstore issue
Could you make a gstore version that logs all the drop graph and insert graph statements? I will then try to reproduce it with standalone Virtuoso.
Probably linked to the issue you posted, but it has been unsolved since 2016 - let's hope for the best :D
Branch of the databus-transfer repo to reproduce the issue: https://github.com/dbpedia/databus-transfer/tree/insert-stopped-debug
@holycrab13 added a GSTORE_LOG_LEVEL env var, you can now set GSTORE_LOG_LEVEL=DEBUG in docker-compose to enable logging of queries
OK, very strange issue... The query size doesn't matter. It happens on different queries, randomly. I mean: if I split the file into inserts of 100 triples, it fails randomly on different parts, sometimes on the first 200 triples, sometimes on 900 triples. No idea what the problem is. It is a bug in Virtuoso; we can open a ticket there.
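For anyone who wants to repeat the batching experiment, here is a minimal sketch of how a triples file could be cut into inserts of a fixed size. `make_batches` is a hypothetical helper, not part of gstore or databus-transfer; it assumes one triple per line and reuses the `SPARQL INSERT IN GRAPH <...> { ... } ;` form from the isql commands in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: wrap every <size> lines of a one-triple-per-line file
# in its own SPARQL INSERT statement for an isql script.
make_batches() {
  local graph="$1" size="$2" infile="$3"
  awk -v g="$graph" -v n="$size" '
    (NR - 1) % n == 0 {
      if (NR > 1) print "} ;"                  # close the previous batch
      print "SPARQL INSERT IN GRAPH <" g "> {" # open a new batch
    }
    { print "  " $0 }                          # the triple itself, unchanged
    END { if (NR > 0) print "} ;" }            # close the last batch
  ' "$infile"
}

# Example (file names are placeholders):
#   make_batches "http://localhost:3002/g/test/example" 100 triples.txt > batches.sparql.txt
#   isql-vt 1111 dba password VERBOSE=ON batches.sparql.txt
```

The helper copies each triple verbatim, so any escaping problem in the input is passed through unchanged. That is intentional here, since the point is to see which batch fails.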
Restarting virtuoso and gstore and saving some other files first helps.
Posted repro to https://github.com/openlink/virtuoso-opensource/issues/571 hope they will be able to fix this soon
ok, I also reproduced the bug now. I made a test set for bash:
isql-vt 1111 dba password VERBOSE=ON i1.sparql.txt > 1.1.txt 2>1.2.txt
isql-vt 1111 dba password VERBOSE=ON i2.sparql.txt > 2.1.txt 2>2.2
I split the triples and ran them individually:
while read p; do
  echo "----------------"
  echo "$p"
  isql-vt 1111 dba password VERBOSE=ON exec="sparql INSERT IN GRAPH <http://localhost:3002/g/test/mappings-geo-coordinates-mappingbased-2018.09.12-dataid.jsonld> { $p } ;"
  echo "----------------"
done <triples.txt
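One thing worth ruling out when replaying this loop: plain `read` treats backslashes as escape characters and removes them, so a triple whose literal contains `\"` reaches isql with bare quotes. Whether that played a role in the errors reported here is not established. The sketch below only shows the shell behaviour; `read_plain` and `read_raw` are hypothetical names.

```bash
#!/usr/bin/env bash
# Read one line the way the loop above does (backslashes are consumed).
read_plain() { local p; read p; printf '%s\n' "$p"; }

# Read one line verbatim: -r keeps backslashes, IFS= keeps surrounding whitespace.
read_raw() { local p; IFS= read -r p; printf '%s\n' "$p"; }

line='<s> <p> "a \"quoted\" word" .'
printf '%s\n' "$line" | read_plain   # prints: <s> <p> "a "quoted" word" .
printf '%s\n' "$line" | read_raw     # prints: <s> <p> "a \"quoted\" word" .
```

Using `while IFS= read -r p; do ... done <triples.txt` would hand each triple to isql exactly as it appears in the file.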
seems like it is definitely the preview triples. When run individually they throw syntax errors: ri2.txt
Then I split the triples of i2 into no preview (i3) and only preview (i4): i3.sparql.txt i4.sparql.txt
Then I tested it again:
# The non-preview triples are loaded first. This seems to initialize the DB properly and set up the graph. Loading i1 and i2 afterwards still throws the error "-- More than 0 parameters, ignoring all the rest of the statement #line 1 "i2.sparql.txt"", but they no longer corrupt the store.
isql-vt 1111 dba password VERBOSE=ON i3.sparql.txt > 3.1.txt 2>3.2.txt
isql-vt 1111 dba password VERBOSE=ON i1.sparql.txt > 1.1.txt 2>1.2.txt
isql-vt 1111 dba password VERBOSE=ON i2.sparql.txt > 2.1.txt 2>2.2.txt
# Running i1 or i2 first, which contain the preview property, messes up the store:
isql-vt 1111 dba password VERBOSE=ON i1.sparql.txt > 1.1.txt 2>1.2.txt
isql-vt 1111 dba password VERBOSE=ON i2.sparql.txt > 2.1.txt 2>2.2.txt
isql-vt 1111 dba password VERBOSE=ON i3.sparql.txt > 3.1.txt 2>3.2.txt
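To make the two orderings easier to replay, the runs could be wrapped in a small function. This is only a sketch: `run_in_order` and the `ISQL_CMD` variable are hypothetical, the output files are named `<input>.out` and `<input>.err` rather than `1.1.txt` etc., and the check only looks for the message text quoted above, in both streams, since it is not stated which stream isql writes it to.

```bash
#!/usr/bin/env bash
# Command used to reach Virtuoso; overridable so the function can be dry-run.
ISQL_CMD=${ISQL_CMD:-isql-vt}

# Run each given script file in order and flag those whose output contains
# the "ignoring all the rest of the statement" message.
run_in_order() {
  local f
  for f in "$@"; do
    "$ISQL_CMD" 1111 dba password VERBOSE=ON "$f" > "$f.out" 2> "$f.err"
    if grep -q 'ignoring all the rest of the statement' "$f.out" "$f.err"; then
      echo "WARN $f"
    else
      echo "OK   $f"
    fi
  done
}

# Example, after restarting with a clean store each time:
#   run_in_order i3.sparql.txt i1.sparql.txt i2.sparql.txt
#   run_in_order i1.sparql.txt i2.sparql.txt i3.sparql.txt
```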
Conclusion: Overall this seems to be an encoding issue. ODBC/JDBC have certain control and macro characters like $. The preview triple was originally created by me in the old Maven upload client. Back then I already had trouble creating these, as it is still unclear to me what exactly I need to escape/encode when putting RDF inside RDF as a literal. This is compounded by the different available syntaxes (N-Triples, Turtle, RDF/XML), plus the fact that they have to go into SPARQL, which is yet another syntax, and I am not sure whether SPARQL INSERT is exactly like Turtle or differs in details.
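On the escaping question, one piece can be pinned down: for a double-quoted string literal, N-Triples, Turtle and SPARQL all accept the backslash escapes `\\`, `\"`, `\n`, `\r` and `\t`. The sketch below applies just those five to a preview text before it is embedded as a literal. `escape_literal` is a hypothetical helper, and it does not address any isql-level handling of characters such as `$`, which remains an open question in this thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: escape a string for use inside a double-quoted
# RDF/SPARQL literal. Backslashes must be handled first so that the
# escapes added afterwards are not doubled.
escape_literal() {
  local s="$1"
  s=${s//\\/\\\\}        # \  -> \\
  s=${s//\"/\\\"}        # "  -> \"
  s=${s//$'\n'/\\n}      # newline         -> \n
  s=${s//$'\r'/\\r}      # carriage return -> \r
  s=${s//$'\t'/\\t}      # tab             -> \t
  printf '%s' "$s"
}

# Example with placeholder subject and predicate:
preview=$'line one\nsays "hi"'
printf '<s> <p> "%s" .\n' "$(escape_literal "$preview")"
# prints: <s> <p> "line one\nsays \"hi\"" .
```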
Solution suggestions:
Still uncertain:
@kurzum you had better post it there: https://github.com/openlink/virtuoso-opensource/issues/571 It really is happening at different moments and even at different places in the same data.
https://github.com/dbpedia/databus-transfer/issues/1