datameet-pune / datameet-pune.github.io

Common repo and documentation space for DataMeet Pune chapter
https://sites.google.com/view/datameetpune/home
GNU General Public License v3.0

Bank accounts data from PMJDY website #10

Open answerquest opened 6 years ago

answerquest commented 6 years ago

Datameet group thread: https://groups.google.com/forum/#!searchin/datameet/pdfs%7Csort:date/datameet/ErNY82gA7dw/mmBUxH5DAgAJ

Site: https://www.pmjdy.gov.in/archive

dhaneshsabane commented 6 years ago

The website gives you an option to export the data as PDFs. Do we want to export it in that format, or scrape the data from the webpage into a CSV or JSON file?

answerquest commented 6 years ago

@Dhanesh95 we want to get the numbers out in a way that lets them be combined across time. I'd pitch for scraping into JSON, as the data will likely become hierarchical when we combine it across different dates.

Note: the data seems to be available only for Wednesdays, and data for some dates may be missing, so the scraper will need to handle that.
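A minimal sketch of how the Wednesday-only schedule and the possible gaps could be handled. The `fetch` callable is hypothetical: it stands in for whatever function ends up requesting and parsing one date from the site, and is assumed to return `None` when nothing is published for that date. The result is keyed by ISO date so that snapshots can be combined across time.

```python
from datetime import date, timedelta


def wednesdays(start, end):
    """Yield every Wednesday from start to end, inclusive."""
    # date.weekday() counts Monday as 0, so Wednesday is 2.
    current = start + timedelta(days=(2 - start.weekday()) % 7)
    while current <= end:
        yield current
        current += timedelta(days=7)


def collect(start, end, fetch):
    """Return ({iso_date: data}, [iso_dates_with_no_data]).

    `fetch` is a hypothetical callable: it takes a date and returns the
    parsed data for that date, or None if the site has nothing for it.
    """
    results, missing = {}, []
    for day in wednesdays(start, end):
        data = fetch(day)
        if data is None:
            # Record the gap instead of failing, so a rerun can retry it.
            missing.append(day.isoformat())
        else:
            results[day.isoformat()] = data
    return results, missing
```

Keeping the missing dates in a separate list, rather than silently skipping them, makes it easy to tell a genuine gap on the site from a date the scraper never tried.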

dhaneshsabane commented 6 years ago

@answerquest I was thinking along the same lines. Generating a JSON file is convenient because it can be converted into other data formats as needed. I also noted that the data is available for each Wednesday, and I'm confident I can build a scraper for this use case. Do you mind if I get started on this right away? Maybe we can finish it off at the hackathon.
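To illustrate the conversion point, here is a rough sketch that flattens date-keyed JSON into CSV using only the standard library. The nesting (`date -> group -> fields`) and the field names used with it are assumptions made for illustration, not the actual layout of the PMJDY data.

```python
import csv
import io


def flatten(snapshots):
    """Turn {date: {group: {field: value}}} into flat row dicts."""
    for day, groups in sorted(snapshots.items()):
        for group, fields in groups.items():
            yield {"date": day, "group": group, **fields}


def to_csv(snapshots):
    """Render the nested snapshots as CSV text, one row per date and group."""
    rows = list(flatten(snapshots))
    # Collect column names in first-seen order, since groups may differ.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    # restval fills in fields that a particular row does not have.
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Going from JSON to CSV this way loses nothing as long as the nesting depth is fixed; deeper or irregular hierarchies would need a different flattening rule.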

answerquest commented 6 years ago

@Dhanesh95 sorry just seeing this now, on the day of the Hackathon :laughing:

dhaneshsabane commented 6 years ago

A basic scraper for the website is ready; it generates a CSV file of all the data from the website. You can find the code here: https://git.fosscommunity.in/Dhanesh95/pmjdyScraper

dhaneshsabane commented 6 years ago

@answerquest The work here is not 100% complete and I have a few ideas in mind that I'd like to implement. Please assign this issue to me.

answerquest commented 6 years ago

@Dhanesh95 ok done

answerquest commented 5 years ago

Suggestion: we can visualize this using Highcharts: https://www.highcharts.com/stock/demo
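If we go that way, the scraped numbers would need to be reshaped into time-series points. A small sketch, assuming one total per ISO date (the input shape is hypothetical) and assuming the chart is fed `[timestamp_in_milliseconds, value]` pairs sorted by time:

```python
import json
from datetime import datetime, timezone


def to_series(totals):
    """Convert {iso_date: value} into sorted [epoch_ms, value] pairs."""
    points = []
    for day, value in sorted(totals.items()):
        # Treat each date as midnight UTC to get a stable timestamp.
        moment = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
        points.append([int(moment.timestamp() * 1000), value])
    return points


def to_series_json(totals):
    """Serialize the points so they can be loaded by the charting page."""
    return json.dumps(to_series(totals))
```

Since the data only exists for Wednesdays, the resulting series would have one point per week, with gaps wherever a date was unavailable.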