scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0

Implement command structure for the CLI #136

Closed: andrewtavis closed 4 months ago

andrewtavis commented 5 months ago

Description

This issue would implement a command structure for the Scribe-Data CLI with the following commands:

Based on the discussion in the sync, I think that it makes sense for us to stick with argparse as it's already being used in the codebase :)
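As a rough illustration of what an argparse-based command structure could look like, here is a minimal sketch using subparsers. The program name `scribe-data`, the subcommands `list` and `get`, and the `--language` / `--word-type` flags are all hypothetical placeholders chosen for the example, not the commands proposed in this issue.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a top-level parser with one subparser per command."""
    # "scribe-data" is a placeholder program name (assumption).
    parser = argparse.ArgumentParser(
        prog="scribe-data",
        description="Hypothetical CLI skeleton built on argparse subcommands.",
    )
    # dest="command" records which subcommand was chosen;
    # required=True makes argparse reject an invocation with no subcommand.
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Hypothetical subcommand: list what is available.
    list_parser = subparsers.add_parser("list", help="List available options.")
    list_parser.add_argument("--language", "-l", help="Restrict to one language.")

    # Hypothetical subcommand: fetch data for a language and word type.
    get_parser = subparsers.add_parser("get", help="Get data for a language.")
    get_parser.add_argument("--language", "-l", required=True)
    get_parser.add_argument("--word-type", "-w", help="For example: nouns.")

    return parser


def describe(argv: list[str]) -> str:
    """Parse argv and return a short description of the requested action."""
    args = build_parser().parse_args(argv)
    if args.command == "list":
        return f"list language={args.language}"
    return f"get language={args.language} word_type={args.word_type}"
```

Keeping each command in its own subparser means the functionality for a command can be filled in later without touching the others, which matches the plan of handling functionality in separate issues.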

Note that this issue is solely for implementing the commands themselves, with individual issues for functionality following from there! Note further that we need to chat about directory structure and where these commands will be defined. I'll be reaching out to people on this 😊

Contribution

Note that this issue is a part of GSoC 2024 and thus will be assigned to @mhmohona 🥳 Please write in so we can assign :)

andrewtavis commented 5 months ago

We can get started on this once #134 is done. Let's also include #125 for GSoC, but we can work on that later as we reimplement some things :)

mhmohona commented 5 months ago

I would like to work on this issue.

andrewtavis commented 5 months ago

Thanks @mhmohona! ☀️☀️

mhmohona commented 5 months ago

For retrieving data, should I read it from this folder: https://github.com/scribe-org/Scribe-Data/tree/main/language_data_export?

andrewtavis commented 5 months ago

Yes exactly, @mhmohona! The formatted_data directories have been moved out of the language folders, and in a bit we'll remove the language_data_export directory and replace it with what the user passes to the CLI :) Do you have an idea for the baseline file name? Maybe we can do something like scribe_language_data_export with the given languages and word types inside? But then they could of course choose their own name as well :)
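One way the default-name idea could work with argparse is sketched below: the export directory defaults to `scribe_language_data_export`, and the user can override it. The `--output-dir` flag, the other flag names, and the `<dir>/<language>/<word_type>.json` layout are assumptions made for illustration only; nothing here is decided in this thread.

```python
import argparse
from pathlib import Path

# Default name suggested in the comment above.
DEFAULT_EXPORT_DIR = "scribe_language_data_export"


def resolve_export_path(argv: list[str]) -> Path:
    """Return the file path an export would be written to."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag: lets the user pick their own directory name.
    parser.add_argument("--output-dir", "-o", default=DEFAULT_EXPORT_DIR)
    parser.add_argument("--language", "-l", required=True)
    parser.add_argument("--word-type", "-w", required=True)
    args = parser.parse_args(argv)

    # Assumed layout: one folder per language, one JSON file per word type.
    return Path(args.output_dir) / args.language / f"{args.word_type}.json"
```

With no `--output-dir`, `resolve_export_path(["-l", "english", "-w", "nouns"])` resolves inside `scribe_language_data_export`; passing `-o my_export` swaps only the top-level directory.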