robbi5 / kleineanfragen

Collecting kleine Anfragen from Parlamentsdokumentationssystemen for easy search- and linkability
https://kleineanfragen.de
MIT License
43 stars, 9 forks

Work with new Search + Detail Pages for Baden-Wuerttemberg #118

Closed (ynux closed 5 years ago)

ynux commented 5 years ago

They moved the search and detail pages in BW: "Einträge für die 16. Wahlperiode ab dem 9. März 2018 finden Sie ausschließlich dort." ("Entries for the 16th legislative period from 9 March 2018 onward can be found exclusively there.") Older documents can be found in both places. Instead of something like http://suche.landtag-bw.de/redirect.itl?WP=15&DRS=6432 you now have to obtain a report_id via a POST request to get a temporary URL like https://parlis.landtag-bw.de/parlis/report.tt.html?report_id=MjAxOTAxMDQtMjEwMDQzLTA4MDgtTEJXOnN1Y2hlcmdlYm5pcy1kb2t1bWVudG51bW1lcjpodG1sOjo6MTpzRE5SU08gc1JOUkRT and the structure of the detail page has changed, too.

This pull request is far from perfect. I only fixed the tests and the scraper, not the rest of the system. Known issues are:

  1. Possibly the changing detail link / report_id creates problems later in the system - there seems to be a timestamp in it
  2. Major interpellations aren't covered
  3. The encoding on the new pages is broken, and I left it as it is
  4. The ministries now come with their full names. Splitting along "and" for multiple ministries doesn't work any more, and later processing will break
  5. 9 tests with anomalies are skipped
  6. The detail page links aren't tested (since they change)
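
To illustrate issue 1: the report_id in the example URL above looks like Base64. The following is a minimal Python sketch (not code from this pull request) that decodes it and reads the leading fields as a date and time. The function names are hypothetical, and interpreting those fields as a timestamp is an assumption based on how the decoded string looks.

```python
import base64
from datetime import datetime

# report_id copied from the temporary URL quoted in this comment.
REPORT_ID = (
    "MjAxOTAxMDQtMjEwMDQzLTA4MDgtTEJXOnN1Y2hlcmdlYm5pcy1kb2t1bWVudG51"
    "bW1lcjpodG1sOjo6MTpzRE5SU08gc1JOUkRT"
)


def decode_report_id(report_id: str) -> str:
    """Base64-decode a report_id (hypothetical helper)."""
    # Add padding in case the id length is not a multiple of four.
    padded = report_id + "=" * (-len(report_id) % 4)
    return base64.b64decode(padded).decode("ascii")


def report_timestamp(report_id: str) -> datetime:
    """Parse the first two dash-separated fields as date and time.

    Assumption: the fields are YYYYMMDD and HHMMSS. The format is
    not documented anywhere in this thread.
    """
    date_part, time_part = decode_report_id(report_id).split("-")[:2]
    return datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")


if __name__ == "__main__":
    print(decode_report_id(REPORT_ID))
    print(report_timestamp(REPORT_ID))
```

If that reading is right, the id encodes when the search was run, which would explain why the resulting URL is only temporary and unsuitable as a stable detail link.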

Even with these known issues I'm submitting this now, to find out if this makes sense for the project.
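
Issue 4 above can be sketched as follows. This is a hedged Python illustration, not the project's code: it assumes the separator is the German word "und", and the ministry names and helper functions are hypothetical examples; the exact strings on the new pages may differ.

```python
import re

# Hypothetical full ministry names, used only for illustration.
KNOWN_MINISTRIES = [
    "Ministerium für Umwelt, Klima und Energiewirtschaft",
    "Ministerium für Inneres, Digitalisierung und Migration",
]


def naive_split(text):
    """Split on the word "und"; breaks names that contain "und"."""
    return [part.strip() for part in re.split(r"\s+und\s+", text)]


def split_by_known_names(text, known=KNOWN_MINISTRIES):
    """Return the known names found in text, in order of appearance."""
    found = [(text.find(name), name) for name in known if name in text]
    return [name for _, name in sorted(found)]


if __name__ == "__main__":
    joined = (
        "Ministerium für Inneres, Digitalisierung und Migration"
        " und Ministerium für Umwelt, Klima und Energiewirtschaft"
    )
    print(naive_split(joined))           # four fragments
    print(split_by_known_names(joined))  # the two full names
```

Matching against a list of known names avoids cutting a name in half, at the cost of having to maintain that list whenever ministries are renamed.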

robbi5 commented 5 years ago

Thanks for the PR 👍

> Possibly the changing detail link / report_id creates problems later in the system - there seems to be a timestamp in it

> The detail page links aren't tested (since they change)

Fixed by using their quicklinks feature. I found it in the HTML of the result page and played with the parameters.

I'll try to add support for major interpellations and the overview scraper in the next few days.