karel1980 closed 2 months ago
Note: the code is still 💩, but at least the tests pass (unless recent plenaries have been added to dekamer.be, in which case there will be >300 plenaries). We could check that the number of plenaries equals the number of files instead.
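A rough sketch of what that check could look like, in case it helps. The helper names (`count_input_files`, `plenary_count_matches`), the `.html` suffix, and the idea that the parser returns a list of plenaries are all assumptions for illustration, not the project's actual API:

```python
from pathlib import Path


def count_input_files(input_dir, suffix=".html"):
    """Count the regular files directly inside input_dir with the given suffix."""
    return sum(
        1
        for path in Path(input_dir).iterdir()
        if path.is_file() and path.suffix == suffix
    )


def plenary_count_matches(plenaries, input_dir, suffix=".html"):
    """Return True when there is exactly one parsed plenary per input file.

    `plenaries` is assumed to be a list (or any sized collection) produced
    by the parser; how it is produced is outside the scope of this sketch.
    """
    return len(plenaries) == count_input_files(input_dir, suffix)
```

A test could then assert `plenary_count_matches(plenaries, "data/input/html")` instead of comparing against a hard-coded 300, so it keeps passing when new plenaries are published.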
Hey @karel1980, thanks! I will review your work tomorrow. :-) I see we started working in parallel. Don't worry, I'll merge my work onto yours.
The scripts were updated to expect data/input/pdf and data/input/html, but the download script wasn't changed.
Extracting the proposal description would make HTML parsing feature complete and would, in theory, allow it to replace PDF parsing, which could then be removed.