Closed by ghost 7 years ago
Is there a way to export the tables as CSV files instead of a database?
Right now the contents of each article can be saved as files in a local directory. Instead of ex = NewsCorpusGenerator(corpus_dir,'sqlite'), you would specify ex = NewsCorpusGenerator(corpus_dir).
We can add a method to just export the results in the database to CSV but you can already do that independent of the module since you have access to the SQLite file. You can also use http://sqlitebrowser.org/
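For anyone who wants to do the export independently of the module, here is a minimal sketch using only the Python standard library. It assumes nothing about the schema the module creates: table names are read from the SQLite file itself. The function name export_tables_to_csv and the file names in the usage line are hypothetical, not part of this project.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table in the SQLite file to <out_dir>/<table>.csv.

    Hypothetical helper, not part of the module. Returns the list of
    CSV paths that were written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(db_path))
    written = []
    try:
        # Discover table names from the database instead of assuming a schema.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
        for table in tables:
            # Quote the table name as an identifier before interpolating it.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            target = out / f"{table}.csv"
            with open(target, "w", newline="", encoding="utf-8") as fh:
                writer = csv.writer(fh)
                # First row: column names taken from the cursor metadata.
                writer.writerow([col[0] for col in cursor.description])
                writer.writerows(cursor)
            written.append(target)
    finally:
        conn.close()
    return written
```

Usage would look like export_tables_to_csv("corpus.db", "csv_export"), where corpus.db stands in for whatever SQLite file ends up in your corpus directory.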
Are you planning to add more news sites like Yahoo News, Bing News, Reuters, etc.?
I started with that in mind. The plan was to add a variety of news sources, but at the time I had to focus on the main project, which used the generated corpus.
Bing News.
https://datamarket.azure.com/dataset/5BA839F1-12CE-4CCE-BF57-A49D98D29A44
Yahoo News
Most of Yahoo's search APIs are now discontinued, and we will need to figure out the pagination behind retrieving additional results.
https://developer.yahoo.com/search/news/V1/newsSearch.html https://developer.yahoo.com/boss/search/
Other Sources
We can add any other source as long as it can be done programmatically.
I did not push to integrate them in the current release for the above reasons. However, I would love to see additional sources added, and I am open to a pull request if you are interested.
Dear sir, I am not very familiar with Python but I know how to open a Jupyter notebook. Can you please tell me how to download your code and run it in Jupyter or any text editor (or suggest a video)? I am a corpus researcher, totally new to programming, and I am finding myself helpless. Thanks in advance.
Hello umesh, I am not the programmer for this project, and this is not the right place to ask such a trivial question. Without a basic understanding of programming it would be very difficult for you to use this module. I would recommend that you learn the basics of programming. This site, https://www.learnpython.org/, is a good place to learn the basics of Python programming. There is also an Android app: https://play.google.com/store/apps/details?id=com.sololearn.python
Hope this helped