scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0
30 stars 69 forks

Create needed queries for Bangla #134

Closed andrewtavis closed 5 months ago

andrewtavis commented 6 months ago

Terms

Description

This issue covers creating the Wikidata Query Service queries needed for Bangla, specifically for nouns, verbs, prepositions, adjectives, and any other word types of interest. We can explore Wikidata a bit and decide which other word types would be good to include here :) Suggestions welcome!

A quick search indicates that Bengali is the English name for Bangla, so work for this should go into src/scribe_data/extract_transform/languages/Bengali. Separate directories should be added for each word type, and the queries in them can be based on those for other languages and on the available data 😊
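
To give a rough idea of what one query per word type could look like, here is a minimal sketch in Python that only builds the SPARQL text. The helper name `build_lexeme_query` and the `WORD_TYPE_QIDS` mapping are hypothetical and not part of Scribe-Data, and the QIDs (Q9610 for Bengali, plus the lexical category items) are assumptions that should be double checked on Wikidata before use.

```python
# Hypothetical sketch, not Scribe-Data code: build one lexeme query per word type.
# All QIDs below are assumptions to verify on Wikidata before relying on them.
BENGALI_QID = "Q9610"  # assumed item for the Bengali (Bangla) language

WORD_TYPE_QIDS = {
    "nouns": "Q1084",
    "verbs": "Q24905",
    "adjectives": "Q34698",
    "prepositions": "Q4833830",
}


def build_lexeme_query(language_qid: str, category_qid: str) -> str:
    """Return SPARQL text selecting lexeme IDs and lemmas for one word type."""
    return f"""
SELECT
  (REPLACE(STR(?lexeme), "http://www.wikidata.org/entity/", "") AS ?lexemeID)
  ?lemma
WHERE {{
  ?lexeme dct:language wd:{language_qid} ;
    wikibase:lexicalCategory wd:{category_qid} ;
    wikibase:lemma ?lemma .
}}
""".strip()


# One query string per word type, keyed by the directory name it might live in.
queries = {
    word_type: build_lexeme_query(BENGALI_QID, qid)
    for word_type, qid in WORD_TYPE_QIDS.items()
}

print(queries["nouns"])
```

The printed text can be pasted into query.wikidata.org to see how much data exists for a given word type before deciding whether it deserves its own directory.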

Contribution

Happy to support the work on this! 🚀

andrewtavis commented 6 months ago

CC @mhmohona, @wkyoshida and @henrikth93 :)

First GSoC issue of the program! ☀️ @mhmohona, can you write in here so I can assign you? 😊

mhmohona commented 6 months ago

I would love to work on this issue!

andrewtavis commented 6 months ago

Assigned! Thanks @mhmohona 😊 Let us know your thoughts on what word types we should be doing!

andrewtavis commented 5 months ago

2f301d2 sent along a minor edit to remove the query comments, @mhmohona :) I think that having the info in the header as you had it makes sense, and Scribe-Data developers and users are able to see what the QIDs and PIDs are in various ways. For users, if someone were to, say, use the query on query.wikidata.org, then there are tooltips on hover that show the label. This functionality is also available in VS Code via the Wikidata QID Labels extension! If you're using VS Code, then I'd definitely suggest installing the extension, and if not, we can look for another way of maybe getting the labels in :)
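
As one possible "other way", here is a small sketch that lists the QIDs and PIDs found in a query so their labels can be looked up by hand. The function name `find_wikidata_ids` and the example query are made up for illustration, and the property in the example is only there to show a PID being picked up.

```python
import re


def find_wikidata_ids(query: str) -> dict:
    """Collect the distinct QIDs and PIDs that appear in SPARQL text."""
    qids = sorted(set(re.findall(r"\bQ\d+\b", query)))
    pids = sorted(set(re.findall(r"\bP\d+\b", query)))
    return {"qids": qids, "pids": pids}


# Illustrative query only; the IDs are assumptions and not taken from the repo.
example_query = """
SELECT ?lexeme ?lemma WHERE {
  ?lexeme dct:language wd:Q9610 ;
    wikibase:lexicalCategory wd:Q1084 ;
    wikibase:lemma ?lemma .
  OPTIONAL { ?lexeme wdt:P5185 ?value . }
}
"""

ids = find_wikidata_ids(example_query)
print(ids)
```

This only finds the identifiers. Resolving them to labels would still need a lookup on Wikidata or the editor extension.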

Thanks for this! ☀️