Open phucdoitoan opened 5 months ago
I haven't had a chance to try to reproduce the error, but it looks like at least one of the claim objects doesn't have a datatype key. I haven't seen this error before, so I wonder if it's something in the most recent dump?
One small fix would be to skip all claims that don't have a datatype key, and then count how many you drop (or write them to an error log file).
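A minimal sketch of that idea, not the actual code in worker_process.py: the helper name filter_claims_with_datatype and the sample claims are hypothetical, and the only thing taken from the traceback is that the code reads claim['mainsnak']['datatype'].

```python
import logging
from typing import Any, Dict, List, Tuple

logger = logging.getLogger(__name__)


def filter_claims_with_datatype(
    claims: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], int]:
    """Hypothetical helper: keep claims whose mainsnak has a 'datatype' key.

    Returns the kept claims and the number of claims that were dropped.
    """
    kept = []
    dropped = 0
    for claim in claims:
        mainsnak = claim.get("mainsnak", {})
        if "datatype" not in mainsnak:
            # Count the drop and log it instead of raising KeyError.
            dropped += 1
            logger.warning("Skipping claim without datatype: id=%s", claim.get("id"))
            continue
        kept.append(claim)
    return kept, dropped


# Made-up sample data, only to illustrate the behaviour.
sample_claims = [
    {"id": "claim-a", "mainsnak": {"property": "P31", "datatype": "wikibase-item"}},
    {"id": "claim-b", "mainsnak": {"property": "P279"}},  # no 'datatype' key
]
kept, dropped = filter_claims_with_datatype(sample_claims)
```

If the drop count stays small relative to the total number of claims, the output tables should be largely unaffected; a large count would suggest the dump format has changed.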
Hi there,
Thanks a lot for your reply.
I do not know much about Wikidata, so I'm not sure whether the datatype key is something recent.
I'll try your suggestion.
However, even with the error reported, the code seems to work fine and all the output tables look okay.
Hi,
Thank you for the useful GitHub code.
When I run the code in preprocess_dump.py to process the latest Wikidata dump (as of April 16) with 28 processes, I get the following error in process 28. However, the code still seems to be running and producing processed tables.
Do you know whether this error is something I should care about, or can I just ignore it?
Thank you a lot!
Process Process-28:
Traceback (most recent call last):
  File "**/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "**/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "**/simple-wikidata-db/simple_wikidata_db/preprocess_utils/worker_process.py", line 151, in process_data
    out_queue.put(process_json(ujson.loads(json_obj), language_id))
  File "**/simple-wikidata-db/simple_wikidata_db/preprocess_utils/worker_process.py", line 91, in process_json
    datatype = claim['mainsnak']['datatype']
KeyError: 'datatype'