Closed ExperimentsInHonesty closed 3 years ago
Spoke to Chris and Natesh about how to merge scraped data using an open source data verification tool. Nat has found one, and we need to discuss it. He also suggested some other tools that could help us, which he used at his company.
A note from Natesh: A few open source tools worth exploring:
1) https://github.com/arviedelgado/roro - An open source Robotic Process Automation (RPA) tool that emulates a human going to a website, scraping data, and helping compile a dataset.
2) https://sourceforge.net/projects/dataquality/ - A data profiling tool for when we collect datasets and want to identify duplicate records with different addresses, invalid addresses, different coordinates, etc. This "potentially" could help reduce the amount of manual effort required for cleanup.
[I haven't personally used these tools, but their corporate cousins are used in enterprises regularly. Potentially these could help reduce some manual effort.]
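To make the data profiling idea above concrete, here is a minimal Python sketch of the kind of check such a tool performs: grouping records by name and flagging groups whose addresses or coordinates disagree. The field names (`name`, `address`, `lat`, `lon`), the helper names, and the coordinate tolerance are all assumptions for illustration; this is not how the linked tool works internally, and the real Food Oasis schema may differ.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def find_conflicting_duplicates(records, tolerance=0.001):
    """Return groups of same-named records whose addresses or coordinates differ.

    `tolerance` is a hypothetical threshold in degrees; tune it for real data.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec["name"])].append(rec)

    conflicts = {}
    for name, recs in groups.items():
        if len(recs) < 2:
            continue  # a single record cannot conflict with itself
        addresses = {normalize(r["address"]) for r in recs}
        lats = [r["lat"] for r in recs]
        lons = [r["lon"] for r in recs]
        coords_differ = (
            max(lats) - min(lats) > tolerance
            or max(lons) - min(lons) > tolerance
        )
        if len(addresses) > 1 or coords_differ:
            conflicts[name] = recs
    return conflicts
```

Groups returned by `find_conflicting_duplicates` would still need a human to decide which record is correct, so this only narrows down what volunteers have to review.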
I foresee a group merging the data, which will later be verified by volunteers. Volunteers will be recruited on social media, on a webinar, and also somehow in groups. I believe we need several ways to verify this data.
Before all of the data is verified, I'd like to go live somehow with the verified data we have at this time. How can we separate the verified data and get it up live so we can test it with users? Since we don't have a staging site, maybe we can upload the site here.
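One possible way to separate verified from unverified data, sketched in Python under the assumption that each record carries a status field. The field name `verification_status` and the value `"verified"` are hypothetical; the actual data may track verification differently.

```python
def split_by_verification(records, status_field="verification_status"):
    """Split records into (verified, pending) lists based on a status field.

    Records missing the field are treated as pending, so nothing unverified
    goes live by accident.
    """
    verified, pending = [], []
    for rec in records:
        if rec.get(status_field) == "verified":
            verified.append(rec)
        else:
            pending.append(rec)
    return verified, pending
```

Only the `verified` list would be published for user testing, while the `pending` list stays in the volunteers' queue.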
Progress this week:
Next Steps:
Follow-up on Open Source Tools:
Checked out the RPA tool Roro. It looks like a potential no-code option for scraping; however, I wasn't able to get it working on my PC. This could be something to explore in other Hackforla projects. As we already have web scrapers in Python and JavaScript for FOLA, I don't think we need this.
I downloaded the data profiling tool but wasn't able to get it to connect to the Postgres database. I will try again later, after we have made more progress on scraping and combining data.
Overview
We need a weekly record of team progress, blockers, availability, and ETA to completion for each volunteer and for the project as a whole.
Action Items
Bonnie added a request for an update to each of the in-progress issues on the Data Verification Board.
@chombus
@chombus & @jpmikesell This issue points out that sometimes they ask us to call back. When we call back a second time, do we already have a script set up for asking whether there is someone we can email? Gabby's issue - https://github.com/hackforla/food-oasis/issues/196