scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0

Include option to additionally retrieve external IDs for data #59

Open wkyoshida opened 8 months ago

wkyoshida commented 8 months ago

Terms

Languages

ALL

Description

This issue is to discuss an option (perhaps a flag) to also retrieve external IDs for data when running the data process. I'm thinking this should be opt-in, i.e. not the default behavior. On the Scribe-Server side, this information could later be useful for tracking when specific data points are new or have been updated in the external sources Scribe references, e.g. Wikidata. For those interested, it could also be useful just to see the IDs.

Open for discussion! :blush::eyes:
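To make the opt-in idea concrete, here is a minimal sketch of what such a flag could look like. Everything in it is an assumption for illustration only: the flag name `--include-external-ids`, the `external_id` key, and the helper names are hypothetical and are not part of Scribe-Data.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical opt-in flag for external IDs."""
    parser = argparse.ArgumentParser(
        description="Hypothetical data update entry point (illustration only)."
    )
    # store_true means the flag defaults to False, so IDs are only
    # retrieved when someone explicitly opts in.
    parser.add_argument(
        "--include-external-ids",
        action="store_true",
        help="Also keep external IDs (e.g. Wikidata IDs) in the output.",
    )
    return parser


def format_record(record: dict, include_external_ids: bool) -> dict:
    """Return a copy of the record, keeping 'external_id' only on opt-in."""
    formatted = {k: v for k, v in record.items() if k != "external_id"}
    if include_external_ids and "external_id" in record:
        formatted["external_id"] = record["external_id"]
    return formatted
```

With this shape the default output stays unchanged, and the ID only appears when the flag is passed, e.g. `format_record(record, build_parser().parse_args().include_external_ids)`.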

andrewtavis commented 7 months ago

Hey @wkyoshida 👋 FYI I made a new issue in iOS that speaks to this even being something that we could include in the app data files 😊 See https://github.com/scribe-org/Scribe-iOS/issues/400. The idea there is that when a verb conjugation isn't showing up, we could instead show a link to the Wikidata page for the given lexeme, so that the person could enter the conjugation and have it show up in the next data download :)

wkyoshida commented 6 months ago

It was decided in the dev sync to go ahead and implement at least the first idea proposed in this issue:

  • For nouns, verbs, and prepositions, this is likely the Wikidata lexemes.

Created a separate issue, #101, to track the work for this, and decided to leave this issue open to continue the discussion on potential ideas for the second point:

  • For translations, autosuggestions, and emoji keywords...

Grabbing the lexemes though will already be a useful addition :grin:
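
For reference, a rough sketch of how a lexeme ID could be pulled out of a Wikidata entity URI and turned into a link to the lexeme page, which is also roughly what the Scribe-iOS idea above would need. The function names are hypothetical and the ID `L123` in the usage note is just a placeholder. The sketch assumes entity URIs of the form `http://www.wikidata.org/entity/L...` and lexeme pages under `https://www.wikidata.org/wiki/Lexeme:L...`.

```python
import re

# Lexeme IDs are an "L" followed by digits.
LEXEME_ID_PATTERN = re.compile(r"L\d+")


def lexeme_id_from_uri(uri: str) -> str:
    """Extract the lexeme ID from an entity URI (hypothetical helper)."""
    candidate = uri.rstrip("/").rsplit("/", 1)[-1]
    if not LEXEME_ID_PATTERN.fullmatch(candidate):
        raise ValueError(f"Not a lexeme URI: {uri!r}")
    return candidate


def lexeme_page_url(lexeme_id: str) -> str:
    """Build the Wikidata page URL for a lexeme ID (hypothetical helper)."""
    if not LEXEME_ID_PATTERN.fullmatch(lexeme_id):
        raise ValueError(f"Not a lexeme ID: {lexeme_id!r}")
    return f"https://www.wikidata.org/wiki/Lexeme:{lexeme_id}"
```

For example, `lexeme_page_url(lexeme_id_from_uri("http://www.wikidata.org/entity/L123"))` gives `https://www.wikidata.org/wiki/Lexeme:L123`, while a non-lexeme URI such as one ending in `Q42` raises a `ValueError`.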