scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0
30 stars 69 forks source link

Add loading and exporting functions to utils #83

Closed shashank-iitbhu closed 8 months ago

shashank-iitbhu commented 8 months ago

Contributor checklist


Description

Added common functions for loading and exporting data to utils.py. Added get_language_dir_path, load_queried_data, export_formatted_data in utils.py .

Related issue

github-actions[bot] commented 8 months ago

Thank you for the pull request!

The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :)

If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. It'd be great to have you!

Maintainer checklist

shashank-iitbhu commented 8 months ago

Hey @andrewtavis ! Added these three functions get_language_dir_path, load_queried_data, export_formatted_data in utils.py . Also used them in the format files in the latest commit. The update_data.py is querying and formatting the data of these languages as expected even after the changes. Please review whenever you get time.

wkyoshida commented 8 months ago

Hey @shashank-iitbhu! :wave: Sorry - just to confirm with you, is this PR good now to start reviewing? Just wanted to check first due to the "[WIP]" still in the PR title :smile:

shashank-iitbhu commented 8 months ago

Hey @shashank-iitbhu! 👋 Sorry - just to confirm with you, is this PR good now to start reviewing? Just wanted to check first due to the "[WIP]" still in the PR title 😄

Yes, this can be reviewed. Haven't added the doc strings yet. Also, let me know if tests should be added for these functions?

shashank-iitbhu commented 8 months ago

The newly added function get_language_dir_path is also required by #88 and #89 . So when this branch is merged, I will have to rebase from main and make changes to PR 88 and 89.

andrewtavis commented 8 months ago

This is really great, @shashank-iitbhu :) Minor inline comments coming shortly 😊

andrewtavis commented 8 months ago

Am thinking that the above changes should be pretty straightforward, @shashank-iitbhu 😊 Happy to merge once they come through!

andrewtavis commented 8 months ago

I added in a couple of changes here to bring our tooling a bit into the next generation: specifically adopting Ruff over Black for formatting. @shashank-iitbhu, I'd suggest you to install the VS Code extension for Ruff so that formatting is taken care of for you :)

Aside from this the only changes to the committed code was to make all arguments for the functions be explicitly passed 😊