Closed joka921 closed 1 day ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 88.12%. Comparing base (
85793e3
) to head (b174038
). Report is 1 commits behind head on master.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Issues
0 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code
@ullingerc Even after this fix, there is still an inconsistency. Namely, in the TSV export, the points appear as POINT(...)
without quotes and without datatype, whereas all other geometry types appear in quotes and with datatype. The TSV format does not require a particular format, but we should of course make it consistent. Do you know how to fix this?
This may be caused by this review: https://github.com/ad-freiburg/qlever/pull/1506#discussion_r1770882688 I then removed the full string representation with quotes and datatype. Tomorrow I will look into it and check if this is actually the cause.
@hannahbast I have checked it and the bug has nothing to do specifically with my implementation in #1506 . It also affects the date type. See for example: https://github.com/ad-freiburg/qlever/blob/f39907ccfe28096199c817b289e5c9131cd27cc3/test/ExportQueryExecutionTreesTest.cpp#L500-L504
@ullingerc Thanks for the info, than this is work for a separate PR. But I just hit another problem, when trying to build the index for the latest version of Wikidata (which never had problems with coordinates in the past):
2024-09-28 15:26:12.867 - ERROR: Parse error at byte position 116541359260: /local/data-ssd/qlever/qlever-code/src/parser/GeoPoint.cpp, line 19: The given value 100 is out of range for latitude coordinates.
Any idea, what the problem is here? Anyway, it would be good to output the whole POINT...
string here.
@ullingerc PS: Shortly before the parse fails, I find the following in the dataset: "Point(0 99.999999)"^^geo:wktLiteral
. I understand that a latitude should not be larger than 90. But it would be good to be more lenient here and either omit the triple or not "fold" it but leave it as an ordinary literal (maybe without datatype). We already have warning messages for that at other places, for example:
2024-09-20 22:17:08.122 - WARN: IRI ref not standard-compliant: <http://rdf.ncbi.nlm.nih.gov/pubchem/patentipc/C4B 16-02>
@hannahbast It's very helpful that you found the problematic item, thanks. I think we should keep the verification for coordinates, otherwise the folding will no longer work in a meaningful way. However, I agree, the error should be handled more gracefully, similar to the example you have pointed out.
After the previous commit, all WKT literals except
POINT
s were missing thegeo:wktLiteral
datatype. This is now fixed again.