scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0

Implement the CLI `list-languages` and `list-word-types` functionality #148

andrewtavis closed this issue 4 months ago

andrewtavis commented 5 months ago

Description

This issue is already covered by the work in #140, but I'm making it to document the changes being made. The `list-languages` functionality of the Scribe-Data CLI will provide an alphabetical list of the languages for which the user can get lexeme data from Wikidata. The `list-word-types` command will return all available word types if no `--language` argument is passed; if one is passed, the results will be filtered to just the word types available for that language (or languages).
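The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the implementation from #140: the `LANGUAGE_WORD_TYPES` mapping, its contents, and both function signatures are assumptions made for the example.

```python
# Hypothetical metadata: language name -> word types available for it.
# The values are illustrative only and do not reflect the project's real data.
LANGUAGE_WORD_TYPES = {
    "German": ["nouns", "verbs", "prepositions"],
    "English": ["nouns", "verbs"],
    "Swedish": ["nouns", "verbs", "prepositions"],
}


def list_languages(metadata=LANGUAGE_WORD_TYPES):
    """Return all language names in alphabetical order."""
    return sorted(metadata)


def list_word_types(languages=None, metadata=LANGUAGE_WORD_TYPES):
    """Return word types for all languages, or only for the given ones."""
    if not languages:
        # No --language argument: consider every language.
        selected = metadata.values()
    else:
        unknown = [name for name in languages if name not in metadata]
        if unknown:
            raise ValueError(f"Unknown language(s): {', '.join(unknown)}")
        selected = [metadata[name] for name in languages]
    # Merge the per-language lists, drop duplicates, and sort alphabetically.
    return sorted({word_type for word_types in selected for word_type in word_types})
```

With this sample data, `list_word_types(["English"])` would leave out `prepositions`, while calling it with no argument returns the union across all languages.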

Contribution

@mhmohona has already implemented the changes for this in #140 ☀️😊 Can you write in here so I can assign you? Just a quick note: I'm making issues like this so you can document some of the work you've been doing on Phabricator!

mhmohona commented 5 months ago

I have updated #140 further as per this issue's requirements. We now have the following functionality:

Functionality of list_word_types:

Functionality of list_languages:

[image]

andrewtavis commented 5 months ago

Thanks for this, @mhmohona!

andrewtavis commented 5 months ago

Note from #140 where I was thinking about this:

scribe-data list --language  # list all languages
scribe-data list --word-type  # list all word types
scribe-data list --language German --word-type  # list all German word types
scribe-data list --language --word-type nouns  # list all languages that you can get nouns for

Open to how this could improve further!
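One way the flag combinations above could be parsed is with `argparse`, using `nargs="?"` and `const=True` so that a flag given without a value means "list everything for this dimension". This is only a sketch under that assumption: the `build_parser` and `describe` helpers are hypothetical, and the actual CLI in #140 may be structured differently.

```python
import argparse


def build_parser():
    """Build a hypothetical parser for `scribe-data list`."""
    parser = argparse.ArgumentParser(prog="scribe-data")
    subparsers = parser.add_subparsers(dest="command")
    list_parser = subparsers.add_parser("list")
    # A bare flag stores True; a flag with a value stores that value.
    list_parser.add_argument("--language", nargs="?", const=True, default=None)
    list_parser.add_argument("--word-type", nargs="?", const=True, default=None)
    return parser


def describe(args):
    """Translate parsed arguments into a description of what to list."""
    language, word_type = args.language, args.word_type
    if language is True and word_type is None:
        return "all languages"
    if word_type is True and language is None:
        return "all word types"
    if isinstance(language, str) and word_type is True:
        return f"word types for {language}"
    if language is True and isinstance(word_type, str):
        return f"languages with {word_type}"
    return "unsupported combination"
```

For example, `describe(build_parser().parse_args(["list", "--language", "German", "--word-type"]))` yields `"word types for German"`.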

mhmohona commented 5 months ago

@andrewtavis, I have updated the commands: [image]

andrewtavis commented 4 months ago

Closed by #140 🚀 Thank you, @mhmohona! Here's a quick note on what I changed the outputs to, btw:

(venv) scribe-org/Scribe-Data » scribe-data l

Language     ISO  QID    
-----------------------
English      en   Q1860  
French       fr   Q150   
German       de   Q188   
Italian      it   Q652   
Portuguese   pt   Q5146  
Russian      ru   Q7737  
Spanish      es   Q1321  
Swedish      sv   Q9027  
-----------------------

Available word types: All languages
-----------------------------------
nouns
prepositions
translations
verbs
-----------------------------------
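A column-aligned table like the language listing above could be produced with a small helper along these lines. The `format_table` function is a hypothetical sketch for illustration and is not taken from the Scribe-Data code base; in particular, it trims trailing spaces, which the output above does not.

```python
def format_table(headers, rows, gap=2):
    """Return a left-aligned text table framed by dashed rules."""
    # Width of each column is the longest cell in it, header included.
    widths = [max(len(str(cell)) for cell in column) for column in zip(headers, *rows)]

    def format_row(cells):
        padded = (str(cell).ljust(width) for cell, width in zip(cells, widths))
        return (" " * gap).join(padded).rstrip()

    rule = "-" * (sum(widths) + gap * (len(widths) - 1))
    return "\n".join([format_row(headers), rule, *(format_row(r) for r in rows), rule])


if __name__ == "__main__":
    # Sample rows copied from the output shown in this thread.
    print(format_table(["Language", "ISO", "QID"], [("English", "en", "Q1860"), ("German", "de", "Q188")]))
```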

Again, we'll need to figure out a way to get emoji_keywords and autosuggestions back into the word types, which might require us to put a directory for them under each language for which they're implemented :) That'd be ok, I think 😊
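If the word types were discovered from per-language directories as suggested, the lookup might look something like the sketch below. The `<data_root>/<Language>/<word_type>/` layout and the `available_word_types` helper are assumptions for illustration only; the thread does not settle on a concrete layout.

```python
from pathlib import Path


def available_word_types(data_root, language):
    """List word types for a language from its subdirectories.

    Assumes a hypothetical layout of <data_root>/<Language>/<word_type>/.
    """
    language_dir = Path(data_root) / language
    if not language_dir.is_dir():
        # Unknown language or nothing implemented yet.
        return []
    return sorted(entry.name for entry in language_dir.iterdir() if entry.is_dir())
```

Under this layout, adding an `emoji_keywords` directory for a language would be enough for it to show up in that language's word types.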

First fully functional command! Great work so far!