jamesturk / spatula

A modern Python library for writing maintainable web scrapers.
https://jamesturk.github.io/spatula/
MIT License

built-in support/examples for scraping ASP.net pages #24

Open jamesturk opened 3 years ago

jamesturk commented 3 years ago

A common challenge people run into is scraping pages built with ASP.net (or similar technology) where there is a ton of state kept in forms and cookies. I've had some success doing this with a custom source (https://github.com/openstates/openstates-scrapers/blob/main/scrapers_next/hi/people.py#L6) but we can formalize or at least document the approach.
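
For anyone landing here, a rough sketch of the general idea, not spatula's API and not the code in the linked scraper: ASP.net pages usually round-trip their state through hidden form inputs such as `__VIEWSTATE` and `__EVENTVALIDATION`, so a scraper has to read those from the page it just fetched and send them back with the next POST. The names `HiddenInputParser`, `build_postback`, `btnSearch`, `SearchBox`, and the sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback(html, event_target, extra=None):
    """Build the form payload for the next request (hypothetical helper).

    Copies the hidden state fields from the previous response, names the
    control that "fired" the postback, then layers on visible form values.
    """
    parser = HiddenInputParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = event_target
    payload.setdefault("__EVENTARGUMENT", "")
    payload.update(extra or {})
    return payload


# Made-up page fragment standing in for a real ASP.net response.
SAMPLE_HTML = """
<form method="post" action="Search.aspx">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTA4" />
  <input type="hidden" name="__EVENTVALIDATION" value="/wEWAgL" />
  <input type="text" name="SearchBox" value="" />
</form>
"""

payload = build_postback(SAMPLE_HTML, "btnSearch", {"SearchBox": "smith"})
body = urlencode(payload)  # form-encoded body for the POST
```

In a real scraper the POST would presumably go through something like a `requests.Session` so the cookies from the first GET are sent along too, and the hidden fields would need to be re-read from each response since they tend to change on every postback.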

If you're seeing this and would find it useful, it'd be great to get an example of a page that would need this technique as well as a 👍 or any other comments you have.

mscarey commented 3 years ago

Yes, I was wondering if this kind of approach could be used for search forms like the ones Tyler Tech uses. For instance, https://odysseypa.traviscountytx.gov/JPPublicAccess/Search.aspx?ID=900.