@Stiksels Please set "parallel-parsing": true in your SETTINGS_JSON. The default has been changed to false. We had reasons for that, but it was still a bad idea, because most or all of the preconfigured Qleverfiles don't set parallel-parsing explicitly, and neither do most of the Qleverfiles out there.
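For anyone landing here with the same slowdown, a minimal sketch of what that setting could look like in a Qleverfile. The placement under an `[index]` section is an assumption based on how typical Qleverfiles are laid out, and any other keys you already have in SETTINGS_JSON should stay alongside it:

```
[index]
# Assumed placement: SETTINGS_JSON is usually part of the [index] section.
# Keep any existing keys; only "parallel-parsing" is the addition here.
SETTINGS_JSON = { "parallel-parsing": true }
```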
Thanks @hannahbast, that was the problem: with parallel-parsing set to true, the parsing speed is back up to 1.0 M/s.
I see the following INFO and WARNING lines now:
2024-10-18 05:22:45.530 - INFO: You specified "parallel-parsing = true", which enables faster parsing for TTL files with a well-behaved use of newlines
2024-10-18 05:22:45.530 - WARN: Parallel parsing set to true in the .settings.json file; this is deprecated, please use the command-line option --parse-parallel or -p instead
The issue from #1468 still remains, however: processing multiple large nquad files gets stuck on "merging partial vocabularies"
Speed with tag:latest 2024-10-17 20:15:07.644 - INFO: Triples parsed: 10,000,000 [average speed 0.3 M/s, last batch 0.3 M/s, fastest 0.3 M/s, slowest 0.3M/s]
Speed with tag:previous 2024-10-17 20:20:48.589 - INFO: Triples parsed: 10,000,000 [average speed 1.1 M/s, last batch 1.1 M/s, fastest 1.1 M/s, slowest 1.1 M/s]