Wikidata / soweego

Link Wikidata items to large catalogs
https://meta.wikimedia.org/wiki/Grants:Project/Hjfocs/soweego_2
GNU General Public License v3.0

mix'n'match client #326

Closed by marfox 5 years ago

marfox commented 5 years ago

This PR introduces the ingestor facility for medium-confidence soweego links. It's a client that interacts with the mix'n'match standard DB (i.e., s51434__mixnmatch_p, not s51434__mixnmatch_large_catalogs_p). It performs the following actions:

  1. add catalog metadata;
  2. keep existing curated links;
  3. delete existing non-curated links;
  4. insert input links.
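
The four actions above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it uses an in-memory SQLite database instead of the mix'n'match DB, and the `catalog` and `link` tables, their columns, and the `make_db` and `ingest` helpers are all hypothetical names chosen for the example. It also assumes that an input link whose external ID already has a curated link is skipped, which is one possible reading of "keep existing curated links".

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE catalog (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE link (
            catalog_id INTEGER,
            ext_id TEXT,
            qid TEXT,
            curated INTEGER  -- 1 = confirmed by a human, 0 = automatic
        );
        """
    )
    return conn


def ingest(conn, catalog_name, links):
    """Ingest (ext_id, qid) pairs and return how many rows were inserted."""
    cur = conn.cursor()

    # 1. Add catalog metadata, unless the catalog is already there.
    cur.execute("SELECT id FROM catalog WHERE name = ?", (catalog_name,))
    row = cur.fetchone()
    if row is None:
        cur.execute("INSERT INTO catalog (name) VALUES (?)", (catalog_name,))
        catalog_id = cur.lastrowid
    else:
        catalog_id = row[0]

    # 2. Keep existing curated links: remember their external IDs.
    cur.execute(
        "SELECT ext_id FROM link WHERE catalog_id = ? AND curated = 1",
        (catalog_id,),
    )
    curated = {r[0] for r in cur.fetchall()}

    # 3. Delete existing non-curated links.
    cur.execute(
        "DELETE FROM link WHERE catalog_id = ? AND curated = 0",
        (catalog_id,),
    )

    # 4. Insert input links, skipping IDs that already have a curated link
    #    (an assumption of this sketch).
    new_rows = [
        (catalog_id, ext_id, qid, 0)
        for ext_id, qid in links
        if ext_id not in curated
    ]
    cur.executemany(
        "INSERT INTO link (catalog_id, ext_id, qid, curated) VALUES (?, ?, ?, ?)",
        new_rows,
    )
    conn.commit()
    return len(new_rows)
```

Running steps 2 to 4 in this order means that re-ingesting a catalog replaces the automatic links while leaving human-confirmed ones untouched.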

CLI: python -m soweego ingestor mix_n_match CATALOG ENTITY THRESHOLD PATH_TO_LINKS

Closes #303.