Open AindriyaBarua opened 2 years ago
Yup sounds interesting, go for it :) Is it possible to extend the functionality for any given script language?
It is, as long as Googlenews supports that language, but we would have to have dataset/pre-trained word embedding model of that language.
I cannot seem to be able to find the languages supported by googlenews python library, could you please try looking it up? However, I did find some pre-trained word embedding models we could use.
After checking the library's code, it looks like it uses all Google supported languages since it essentially uses requests to fetch data. All the languages can be found in this link. As for the pretrained models, if a model for a language exists, load that model. Else default to Tfidf based vectorization techniques.
As of now, I am working on enabling Hindi, I will keep it extensible for any language, and extend it later. I am using pretrained Hindi Fastext model, as it is better for Indian languages, as established here
Alright sure go ahead!
Hello @aditeyabaral I would like to add a feature enhancement to this application, making it able to also search for queries in Hindi (Devanagari script), and output summaries from Hindi news articles. If you are interested in that feature, I will start working on it. Do let me know :)