danielnaab closed 4 years ago
This repo has some good exploration of the raw FAC data via IPython notebooks:
https://github.com/irenatfh/fac
This repo scrapes data; while it doesn't look useful to us, it is worth looking over:
Thanks @danielnaab. Does this complete this issue? (we can chat about it in our meeting tomorrow if easier)
@cantsin the second link has some things about downloading FAC data and renaming it that might be worth 👀
@bpdesigns I'd still like to walk through the Selenium code and poke around in it a bit, to understand how the site is handling the view state.
Holding off on this issue for now pending conversations about data access next week.
@danielnaab wrote a crawler to understand the data
User story
As a new developer, in order to make informed decisions on how to continue development on the Distiller, I would like to understand how data is being scraped from the source system and the underlying data model.
Notes
This task is to review the Selenium scraper and the existing documentation, and to ask stakeholders questions.
Acceptance criteria