Open biegekekeke opened 3 months ago
Hi @biegekekeke and sorry for the spam post above, these keep appearing since yesterday or so... Can you please send us some context:
- What type of system is this (X86 or ARM, which operating system in which version)?
- How do you run QLever (natively compiled, via docker, via the qlever control script, from the command line)? [EDIT: It seems you are running the control script via `qlever index`. Do you have any custom settings in your QLeverfile, and what is the dataset?]
- What are the permissions on the folder you are
Thank you for your response. I am using the Ubuntu 20.04 operating system. For QLever, I installed it by running `pip install qlever` and then used the `qlever index` command.
The QLeverfile is as follows:
[data]
NAME = wikidata
GET_DATA_URL = https://dumps.wikimedia.org/wikidatawiki/entities
GET_DATA_CMD = curl -LO -C - ${GET_DATA_URL}/latest-truthy.nt.bz2 ${GET_DATA_URL}/latest-lexemes.nt.bz2
INDEX_DESCRIPTION = "Full Wikidata dump from ${GET_DATA_URL} (latest-truthy.nt.bz2 and latest-lexemes.nt.bz2)"

[index]
INPUT_FILES = wikidata-20231222-lexemes.nt.bz2 wikidata-20231222-truthy.nt.bz2
CAT_INPUT_FILES = bzcat ${INPUT_FILES}
SETTINGS_JSON = { "languages-internal": ["en"], "prefixes-external": [ "<http://www.wikidata.org/entity/statement", "<http://www.wikidata.org/value", "<http://www.wikidata.org/reference" ], "locale": { "language": "en", "country": "US", "ignore-punctuation": true }, "ascii-prefixes-only": false, "num-triples-per-batch": 10000000 }
WITH_TEXT_INDEX = false
STXXL_MEMORY = 10g

[server]
PORT = 7001
ACCESS_TOKEN = ${data:NAME}_832649627
MEMORY_FOR_QUERIES = 100G
CACHE_MAX_SIZE = 100G

[runtime]
SYSTEM = docker
IMAGE = docker.io/adfreiburg/qlever:latest

[ui]
PORT = 7000
CONFIG = wikidata
The dataset is based on Wikidata, and my folder permissions are already set to `rwx`.
[...] `olympics` in a separate folder) work for you, or do they show the same issues?
Can you run `ls -al` in the directory where you run the `qlever index` command and post the output?
Another idea: What is your file system (a "normal" ext4 disk, some fancy network mount, or something else entirely)? How did you install Docker, and do you have any special Docker configuration (security hardening etc.) that might lead to permission problems inside the container?
And can you also post the output of `qlever index --show`? (This logs what `qlever index` does under the hood.)
When I use `olympics`, the same issue also occurs. These are the permissions of the directory where I run the command:
drwxrwxrwx 2 xxx xxx 4096 Aug 28 12:54 .
drwxrwxrwx 11 xxx xxx 4096 Aug 28 15:34 ..
-rwxrwxrwx 1 xxx xxx 1422 Aug 28 15:44 Qleverfile
-rwxrwxrwx 1 xxx xxx 38 Aug 28 15:14 .stxxl
-rwxrwxrwx 1 xxx xxx 867124662 Aug 28 10:26 wikidata-20231222-lexemes.nt.bz2
-rwxrwxrwx 1 xxx xxx 41030599679 Aug 28 10:29 wikidata-20231222-truthy.nt.bz2
-rwxrwxrwx 1 xxx xxx 830 Aug 28 15:14 wikidata.index-log.txt
-rwxrwxrwx 1 xxx xxx 317 Aug 28 15:14 wikidata.settings.json
The file system I am using is a "normal" ext4 disk. When I run `qlever index --show`, the output is as follows:
qlever index --show
Command: index
echo '{ "languages-internal": ["en"], "prefixes-external": [ "<http://www.wikidata.org/entity/statement", "<http://www.wikidata.org/value", "<http://www.wikidata.org/reference" ], "locale": { "language": "en", "country": "US", "ignore-punctuation": true }, "ascii-prefixes-only": false, "num-triples-per-batch": 10000000 }' > wikidata.settings.json
docker run --rm -u $(id -u):$(id -g) -v /etc/localtime:/etc/localtime:ro -v $(pwd):/index -w /index --init --entrypoint bash --name qlever.index.wikidata docker.io/adfreiburg/qlever:latest -c 'ulimit -Sn 1048576; bzcat wikidata-20231222-lexemes.nt.bz2 wikidata-20231222-truthy.nt.bz2 | IndexBuilderMain -F ttl -f - -i wikidata -s wikidata.settings.json --stxxl-memory 10g | tee wikidata.index-log.txt'
You called "qlever ... --show", therefore the command is only shown, but not executed (omit the "--show" to execute it)
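To narrow down where the permission problem comes from, it may help to run a few basic checks in the same environment as the generated command. The following is only a sketch under assumptions: `check_index_env` and the `.write-probe` file name are hypothetical helpers, not part of QLever, and the checks (user ID, write access to the index directory, open-file limits) are guesses at what might be relevant, not a confirmed diagnosis of the `Operation not permitted` error.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of QLever): print basic facts about the
# environment in which the index build would run.
check_index_env() {
  local dir="${1:-.}"
  echo "uid:gid: $(id -u):$(id -g)"

  # Can the current user create a file in the index directory?
  local probe="$dir/.write-probe.$$"
  if touch "$probe" 2>/dev/null; then
    rm -f "$probe"
    echo "writable: yes"
  else
    echo "writable: no"
  fi

  # Soft and hard limits on open files. The generated command runs
  # `ulimit -Sn 1048576` before it starts IndexBuilderMain.
  echo "open files (soft): $(ulimit -Sn)"
  echo "open files (hard): $(ulimit -Hn)"

  # Try raising the soft limit in a subshell so the caller is unaffected.
  if (ulimit -Sn 1048576) 2>/dev/null; then
    echo "raise soft limit: ok"
  else
    echo "raise soft limit: failed"
  fi
}

# Check the directory given as first argument (default: current directory).
check_index_env "${1:-.}"
```

Saved as, say, `check-env.sh` in the index directory, it can be run on the host with `bash check-env.sh` and inside the container by reusing the flags from the output above: `docker run --rm -u $(id -u):$(id -g) -v $(pwd):/index -w /index --entrypoint bash docker.io/adfreiburg/qlever:latest check-env.sh`. A difference between the two outputs would point at the Docker setup and not at the folder itself.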
@biegekekeke Are you using Docker inside of WSL (Windows Subsystem Linux)?
No
Okay, this sounds like some debugging inside the Docker container is required to track down the concrete issue. I won't have time for this in the coming week, but after that we can tackle it. Are you proficient with GDB, Docker, etc.? If so, you could try debugging the call to IndexBuilderMain inside the container and send me a backtrace of the location where the error occurs, but this requires some computer science background.
2024-08-28 14:47:35.804 - INFO: QLever IndexBuilder, compiled on Tue Aug 27 16:54:45 UTC 2024 using git hash ac257c
2024-08-28 14:47:35.804 - INFO: You specified the input format: TTL
2024-08-28 14:47:35.804 - INFO: Processing input triples from /dev/stdin ...
2024-08-28 14:47:35.804 - INFO: You specified "locale = en_US" and "ignore-punctuation = 1"
2024-08-28 14:47:35.805 - INFO: You specified "parallel-parsing = true", which enables faster parsing for TTL files with a well-behaved use of newlines
2024-08-28 14:47:35.805 - INFO: You specified "num-triples-per-batch = 10,000,000", choose a lower value if the index builder runs out of memory
2024-08-28 14:47:35.805 - INFO: By default, integers that cannot be represented by QLever will throw an exception
2024-08-28 14:47:35.808 - ERROR: Operation not permitted