Data4Democracy / internal-displacement

Studying news events and internal displacement.

Clean up imported modules, improve processing of dates, locations and quantities #113

Closed simonb83 closed 7 years ago

simonb83 commented 7 years ago

This PR makes some changes to Pipeline, Interpreter and Scraper:

  1. Separation of Scraping, Parsing and Database interaction
  2. All DB interaction takes place in Pipeline
  3. Use ExtractedReports wrapper for extracting reports (instead of Report)
  4. Convert extracted quantities into integers
  5. Placeholder for turning extracted dates into absolute DateTimes
  6. Create DateSpans for reports based upon Min and Max DateTimes
  7. Update unit tests for Pipeline.
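
As a rough illustration of items 4 and 6, here is a minimal sketch of converting an extracted quantity to an integer and building a date span from the earliest and latest datetimes. The names `quantity_to_int`, `date_span`, `_UNITS` and `_MULTIPLIERS` are hypothetical and are not taken from the project's code; the word tables are deliberately tiny and only cover a digit or small number word followed by multipliers.

```python
from datetime import datetime
from typing import Iterable, Optional, Tuple

# Hypothetical vocabularies for illustration only; a real interpreter
# would need a much richer mapping.
_UNITS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
_MULTIPLIERS = {"hundred": 100, "thousand": 1000, "million": 1000000}


def quantity_to_int(text: str) -> Optional[int]:
    """Convert strings such as '1,200' or 'two thousand' to an int.

    Returns None for anything outside the simple pattern
    '<number> [multiplier ...]' rather than guessing.
    """
    tokens = text.lower().replace(",", "").split()
    if not tokens:
        return None
    head, *rest = tokens
    if head.isdecimal():
        value = int(head)
    elif head in _UNITS:
        value = _UNITS[head]
    else:
        return None
    for token in rest:
        if token not in _MULTIPLIERS:
            return None
        value *= _MULTIPLIERS[token]
    return value


def date_span(dates: Iterable[datetime]) -> Optional[Tuple[datetime, datetime]]:
    """Return (earliest, latest) for the given datetimes, or None if empty."""
    dates = list(dates)
    if not dates:
        return None
    return min(dates), max(dates)
```

Returning `None` for unrecognised input (for example "several") leaves the decision about vague quantities to the caller instead of silently producing a wrong number.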

simonb83 commented 7 years ago

Hey, I think I see what happened here. I merged a subsequent PR without checking it properly, and it turned out that it was based on an earlier version of the repo. I will try to clean this up a bit later today.

simonb83 commented 7 years ago

Hi @alexanderrich, you were correct the first time: I had forgotten to include the file. I think this is fixed now with PR #124, but let me know if you run into any other issues! Thanks.

alexanderrich commented 7 years ago

Everything works now, thanks!