dpriskorn / LexUse

Scripts related to Wikidata Lexemes and getting usage examples from different CC0 data sources.
GNU General Public License v3.0

Feature request: randomize lexemes instead of forms #7

Open Ainali opened 3 years ago

Ainali commented 3 years ago

When the tool adds just one example to a lexeme and then moves on, it feels much more like poking around than doing useful work. I would rather have it go through all the forms of one lexeme before moving on to the next one. This has the added benefit that when I encounter a lexeme that has been edited by this tool, I can be sure that there are no easy hits left in the databases it uses.

dpriskorn commented 3 years ago

I agree that would be an improvement. Right now we only get forms and then randomly choose from them. Doing this per lexeme would require another query. Pseudo code could be:

1. Find lexemes in Swedish that have forms where some of the forms do not currently have examples.
2. Loop through those lexemes.
3. Get all forms of the current lexeme that lack an example.
4. Loop through those forms and search for examples.
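The pseudo code above could be sketched roughly as follows. This is not LexUse's actual code: `Form`, `group_missing_by_lexeme`, `process_lexemes` and the `find_example` callback are hypothetical names made up for illustration, and the sketch assumes the forms have already been fetched together with a flag saying whether each one has an example.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, NamedTuple, Optional, Tuple


class Form(NamedTuple):
    """Hypothetical record for one form returned by a query."""
    lexeme_id: str        # e.g. "L1" (made-up sample ID)
    form_id: str          # e.g. "L1-F1"
    representation: str   # the written form to search for
    has_example: bool     # True if the form already has a usage example


def group_missing_by_lexeme(forms: Iterable[Form]) -> Dict[str, List[Form]]:
    """Group the forms that still lack an example under their lexeme."""
    grouped: Dict[str, List[Form]] = defaultdict(list)
    for form in forms:
        if not form.has_example:
            grouped[form.lexeme_id].append(form)
    return dict(grouped)


def process_lexemes(
    forms: Iterable[Form],
    find_example: Callable[[str], Optional[str]],
    rng: Optional[random.Random] = None,
) -> List[Tuple[str, str]]:
    """Randomize lexemes, then finish every form of one lexeme before the next."""
    if rng is None:
        rng = random.Random()
    grouped = group_missing_by_lexeme(forms)
    lexeme_ids = list(grouped)
    rng.shuffle(lexeme_ids)  # random order over lexemes, not over forms
    found: List[Tuple[str, str]] = []
    for lexeme_id in lexeme_ids:
        for form in grouped[lexeme_id]:
            example = find_example(form.representation)
            if example is not None:
                found.append((form.form_id, example))
    return found
```

Here `find_example` stands in for whatever search the tool runs against its data sources; it returns `None` when nothing is found, so forms without hits are simply skipped.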

Ainali commented 3 years ago

Here is a starting point: https://w.wiki/zJi. It gets lexemes in Swedish that have forms and senses but no examples at all. It can be improved by also including lexemes where only some forms are still without examples, but this already has >2800 hits, so we can get a lot done just with this.
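For readers who cannot open the short link, a query of that kind might look something like the sketch below. This is not necessarily the query behind the link: it is a guess based on the Wikidata lexeme data model, assuming `Q9027` is Swedish and `P5831` is the usage example property, and `build_query` is a hypothetical helper name.

```python
def build_query(language_qid: str = "Q9027", limit: int = 100) -> str:
    """Build a SPARQL query for lexemes with forms and senses but no usage example.

    Assumptions: Q9027 is the Wikidata item for Swedish and P5831 is the
    "usage example" property; the prefixes used are the ones predefined
    by the Wikidata Query Service.
    """
    return f"""
SELECT DISTINCT ?lexeme ?lemma WHERE {{
  ?lexeme dct:language wd:{language_qid} ;
          wikibase:lemma ?lemma ;
          ontolex:lexicalForm ?form ;
          ontolex:sense ?sense .
  FILTER NOT EXISTS {{ ?lexeme wdt:P5831 ?example . }}
}}
LIMIT {limit}
""".strip()
```

The `FILTER NOT EXISTS` clause is what restricts the result to lexemes with no examples at all; covering lexemes where only some forms lack examples would need a per-form check instead.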