Closed ExperimentsInHonesty closed 3 years ago
I'm tied up this week until Feb 2, but after that I'm pretty open.
I'm available Mondays 9AM-4PM and weekday nights 8:30PM-10:30PM.
@chrislopez28 I can do Monday at 10am if that works for you. I think @chombus is not available during the day, but we can review this without him for now. I'll reach out to you on Slack to confirm your availability for this Monday.
Experimenting with matching potential duplicate entries between scraped files.
Pantries (Food Finders and LMS): closest entry by haversine ("as the crow flies") distance in meters: https://github.com/chrislopez28/fola-data-normalization/blob/master/export/pantries_dist.csv
Farmers Markets (CalFresh and LMS): most similar name string, found by taking the entry with the minimum Levenshtein distance: https://github.com/chrislopez28/fola-data-normalization/blob/master/export/markets_stringdist.csv
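For anyone following along, here is a minimal sketch of the two matching approaches mentioned above: nearest entry by haversine distance, and most similar name by Levenshtein distance. It is not the code from the fola-data-normalization repo. The field names (`name`, `lat`, `lon`), the helper names, and the sample entries are all hypothetical, and the real scraped files may use different columns.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def closest_by_distance(entry, candidates):
    """Candidate with the smallest haversine distance to entry (hypothetical helper)."""
    return min(candidates,
               key=lambda c: haversine_m(entry["lat"], entry["lon"], c["lat"], c["lon"]))


def closest_by_name(entry, candidates):
    """Candidate whose lowercased name has the smallest edit distance (hypothetical helper)."""
    return min(candidates,
               key=lambda c: levenshtein(entry["name"].lower(), c["name"].lower()))


# Made-up sample entries, standing in for rows from two scraped sources.
source_a = {"name": "Main St Food Pantry", "lat": 34.0500, "lon": -118.2500}
source_b = [
    {"name": "Main Street Food Pantry", "lat": 34.0501, "lon": -118.2501},
    {"name": "Westside Farmers Market", "lat": 34.0300, "lon": -118.4500},
]

nearest = closest_by_distance(source_a, source_b)
similar = closest_by_name(source_a, source_b)
```

The closest match is only a candidate duplicate. In practice a cutoff (for example a maximum distance in meters or a maximum edit distance) would probably be needed before treating two entries as the same listing, and that cutoff would have to be tuned against the real data.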
See this link for reference on what type of information we are collecting (stakeholder details), part of current design efforts: https://github.com/hackforla/food-oasis/issues/178#issuecomment-565886143
See the jobs-for-hope repo backend for scraper examples.
Discussion between @chrislopez28 and @ExperimentsInHonesty determined that we need more data for him to work with. So he is going to get some scrapers working, starting with issue #95. We will come back to this issue once more of the scrapers are active.
Progress:
To Do:
Overview
We have scraped a lot of data from various websites. We need to see if it's possible to normalize it into one data set, from which to verify the listings.
Action Items
Please add your availability for a Zoom meeting to the comments.
Resources/Instructions
Scraped resources Google Drive folder