TheScienceMuseum / elastic-wikidata

CLI for loading Wikidata subsets (or all of it) into Elasticsearch
https://www.sciencemuseumgroup.org.uk/project/heritage-connector/
MIT License

Part of claims missing after importing #16

Closed xzhaoyooo closed 3 years ago

xzhaoyooo commented 3 years ago

Hello Dutia,

I found a new problem that might be caused by elastic-wikidata. It seems that at most one property in claims is retained during import: in every document, the only property I can find is P31. I guess something is wrong in wd_entities.py.

I hope you can help me find out the reason. Thank you. 😺

kdutia commented 3 years ago

This might be down to my poor documentation. By default, elastic-wikidata imports only P31, plus any properties you specify using the --properties/-prop flag.
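To illustrate the behaviour described above, here is a minimal sketch of that kind of filtering. It is not elastic-wikidata's actual code: the `filter_claims` function, the `DEFAULT_PROPERTIES` name, and the simplified shape of the `claims` dictionary are all assumptions made for the example.

```python
# Hypothetical illustration only; not taken from elastic-wikidata itself.
DEFAULT_PROPERTIES = ["P31"]  # assumption: P31 (instance of) is always kept


def filter_claims(entity: dict, extra_properties=()) -> dict:
    """Return only the claims whose property ID is P31 or explicitly requested."""
    keep = set(DEFAULT_PROPERTIES) | set(extra_properties)
    claims = entity.get("claims", {})
    return {pid: values for pid, values in claims.items() if pid in keep}


# Simplified entity: real Wikidata claims are nested statement objects.
entity = {
    "id": "Q42",
    "claims": {"P31": ["Q5"], "P106": ["Q36180"], "P569": ["1952-03-11"]},
}

print(filter_claims(entity))            # no extra properties requested
print(filter_claims(entity, ["P106"]))  # P106 requested in addition to P31
```

With no extra properties, only P31 survives, which matches what you are seeing in your documents.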

Did you specify anything using this flag? Would you want to have an option that imports all properties?

If you're looking for a way to import the raw Wikidata JSONs into Elasticsearch, I'd say it's better to write a separate script. elastic-wikidata is meant for importing a small subset of Wikidata into Elasticsearch for a fast search index.

(or let me know, and I can add an option to circumvent simplify_wbgetentities_result)
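For the separate-script route, a rough sketch might look like the following. It assumes the standard Wikidata JSON dump layout (a JSON array with one entity per line, each line ending in a comma) and only builds the bulk actions; the function names and the `wikidata_raw` index name are hypothetical, and nothing here is part of elastic-wikidata.

```python
import json
from typing import Iterable, Iterator


def iter_dump_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one entity dict per line of a Wikidata JSON dump."""
    for line in lines:
        # Each entity line ends with a comma; the array brackets sit on their own lines.
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


def to_bulk_actions(entities: Iterable[dict], index_name: str = "wikidata_raw") -> Iterator[dict]:
    """Wrap each raw entity in an action dict, using the entity ID as the document ID."""
    for entity in entities:
        yield {"_index": index_name, "_id": entity["id"], "_source": entity}


# Tiny in-memory stand-in for a dump file.
sample_lines = ["[", '{"id": "Q1", "claims": {}},', '{"id": "Q2", "claims": {}}', "]"]
actions = list(to_bulk_actions(iter_dump_entities(sample_lines)))
print(len(actions), [a["_id"] for a in actions])
```

Action dicts in this shape can be passed to `elasticsearch.helpers.bulk(client, actions)` from the official Python client. For a real dump you would iterate over the lines of the opened file instead of `sample_lines`.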