scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0

Implement CLI `--total` functionality #162

Closed mhmohona closed 3 months ago

mhmohona commented 4 months ago

Contributor checklist


Description

Implemented the --total (-t) functionality, which checks Wikidata for the total number of lexemes for given groupings of languages and word types.
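A minimal sketch of how such a total could be retrieved, assuming SPARQLWrapper and the public Wikidata Query Service endpoint (the exact query and QID mapping used in this PR may differ):

```python
# Hedged sketch: count lexemes for a given language and lexical category.
# The QIDs below (Q188 = German, Q24905 = verb) are illustrative examples,
# not necessarily the mapping used by Scribe-Data.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
sparql.setReturnFormat(JSON)

query = """
SELECT (COUNT(?lexeme) AS ?total) WHERE {
  ?lexeme dct:language wd:Q188 ;
          wikibase:lexicalCategory wd:Q24905 .
}
"""

sparql.setQuery(query)
results = sparql.query().convert()
total = results["results"]["bindings"][0]["total"]["value"]
print(f"Total number of lexemes: {total}")
```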


Related issue

Fixes - #147

github-actions[bot] commented 4 months ago

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. Also consider joining our bi-weekly Saturday dev syncs. It'd be great to have you!

Maintainer checklist

mhmohona commented 4 months ago

It still has a problem: it is recognizing the data type as a language, as shown in the attached screenshot.

andrewtavis commented 4 months ago

Any thoughts on what's causing it, @mhmohona? :)

andrewtavis commented 4 months ago

Minor comments so far:

  • Let's include the language and the data type in the output
  • From @wkyoshida, something like:

Language: German
Data type: Verbs
Total number of lexemes: 999

  • Let's move data_type_to_qid to the data_type_metadata.json file
  • Let's reference the language_metadata.json file for the language_to_qid information

We'll be good to go after all this! 🥳
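A minimal sketch of the suggested output format, assuming the total comes back from a count query like the one sketched above (the function and variable names here are illustrative, not the PR's actual identifiers):

```python
# Hedged sketch of the suggested output lines; `language`, `data_type` and
# `total_lexemes` are illustrative names, not necessarily those used in the PR.
def print_total(language: str, data_type: str, total_lexemes: int) -> None:
    print(f"Language: {language}")
    print(f"Data type: {data_type}")
    print(f"Total number of lexemes: {total_lexemes}")

print_total("German", "Verbs", 999)
```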

andrewtavis commented 4 months ago

Quick check here @mhmohona, are you planning on getting to the changes we mentioned above?

wkyoshida commented 4 months ago

It still has a problem: it is recognizing the data type as a language, as shown in the attached screenshot.

Doesn't this happen because -l German is passed first? It could just be that the first argument is recognized and used, while the second is passed but disregarded.

Looks like there is a choices parameter that we can pass to add_argument() to specify the allowable options for an argument. Maybe we should look into whether specifying this could make sense as the mechanism for controlling valid/invalid inputs?
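A minimal sketch of that choices mechanism, assuming argparse is used for the CLI (the option names mirror the PR discussion; the value lists and parser structure are assumptions, not Scribe-Data's actual parser):

```python
import argparse

# Hedged sketch: restrict -l/--language and -dt/--data-type to known values so
# argparse itself rejects invalid input. The allowed values are illustrative.
parser = argparse.ArgumentParser(description="Get totals of Wikidata lexemes.")
parser.add_argument(
    "-l", "--language", choices=["english", "german", "spanish"],
    help="The language to check totals for.",
)
parser.add_argument(
    "-dt", "--data-type", choices=["nouns", "verbs", "prepositions"],
    help="The word type to check totals for.",
)

args = parser.parse_args(["-l", "german", "-dt", "verbs"])
print(args.language, args.data_type)
```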

mhmohona commented 4 months ago

Currently stuck on this comment 😢

Let's move data_type_to_qid to the data_type_metadata.json file

andrewtavis commented 4 months ago

By this I mean let's move the functionality of data_type_to_qid to the data_type_metadata.json file such that we just import the data at the top of the file rather than using a custom function :)
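A minimal sketch of that approach, assuming a data_type_metadata.json file that maps word types to Wikidata lexical-category QIDs (the file path, key names, and layout are assumptions about the repository; the QIDs mentioned in the comments, e.g. Q1084 for noun and Q24905 for verb, are the standard Wikidata items but may not be the exact keys Scribe-Data uses):

```python
import json
from pathlib import Path

# Hedged sketch: load the data type -> QID mapping once at module import time
# instead of computing it in a custom helper function. The path is assumed.
_metadata_path = Path(__file__).parent / "data_type_metadata.json"
with _metadata_path.open(encoding="utf-8") as f:
    data_type_metadata = json.load(f)

# Example contents of data_type_metadata.json (illustrative):
# {
#   "nouns": "Q1084",
#   "verbs": "Q24905"
# }

def get_qid(data_type: str) -> str:
    """Look up the Wikidata QID for a given data type."""
    return data_type_metadata[data_type]
```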

mhmohona commented 4 months ago

It still has a problem: it is recognizing the data type as a language, as shown in the attached screenshot.

Doesn't this happen because -l German is passed first? It could just be that the first argument is recognized and used, while the second is passed but disregarded.

Looks like there is a choices parameter that we can pass to add_argument() to specify the allowable options for an argument. Maybe we should look into whether specifying this could make sense as the mechanism for controlling valid/invalid inputs?

@wkyoshida, thank you for looking into it. I have solved this problem.

mhmohona commented 4 months ago

By this I mean let's move the functionality of data_type_to_qid to the data_type_metadata.json file such that we just import the data at the top of the file rather than using a custom function :)

I need help with the QIDs :( I'm unable to find the correct ones. It would be super helpful if you could update the data_type_metadata.json file with the QIDs, @andrewtavis.

mhmohona commented 4 months ago


Minor comments so far:

  • Let's include the language and the data type in the output
  • From @wkyoshida, something like:
Language: German
Data type: Verbs
Total number of lexemes: 999
  • Let's move data_type_to_qid to the data_type_metadata.json file
  • Let's reference the language_metadata.json file for the language_to_qid information

We'll be good to go after all this! 🥳

I have addressed the 1st and 3rd points of feedback from here.
