jamesturk / scrapelib

⛏ a library for scraping unreliable pages
https://jamesturk.github.io/scrapelib/
BSD 2-Clause "Simplified" License

use params in caching #164

Closed: fgregg closed this 2 years ago

fgregg commented 2 years ago

Right now, this request

self.get('http://webapi.legistar.com/v1/chicago/matters?$skip=1000')

would have the following request key

"http://webapi.legistar.com/v1/chicago/matters?$skip=1000"

but

self.get('http://webapi.legistar.com/v1/chicago/matters', params={'$skip': 1000})

would have the request_key of

"http://webapi.legistar.com/v1/chicago/matters"

This behaviour is inconsistent: both calls fetch the same resource, yet the second one drops the query parameters from its request key. This PR would change the request key of

self.get('http://webapi.legistar.com/v1/chicago/matters', params={'$skip': 1000})

to

"http://webapi.legistar.com/v1/chicago/matters?$skip=1000"

If that's undesirable for some reason, then we should maybe return a request_key of None instead, since we probably never want the request_key to be the URL without its arguments.
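As a rough illustration of the idea (not scrapelib's actual implementation), the sketch below merges separately passed params into the URL before using it as a key. The helper name `request_key` and its signature are assumptions made for this example, and it uses only the standard library rather than scrapelib or requests internals.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def request_key(url, params=None):
    """Hypothetical helper: build a cache key that includes query params."""
    parts = urlsplit(url)
    # Start from any query string already embedded in the URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Append params passed separately, as in get(url, params={...}).
    if params:
        query.extend((str(k), str(v)) for k, v in params.items())
    # Re-encode everything so both call styles normalise to one form.
    return urlunsplit(parts._replace(query=urlencode(query)))


# Both call styles from the description now map to the same key.
inline = request_key("http://webapi.legistar.com/v1/chicago/matters?$skip=1000")
separate = request_key(
    "http://webapi.legistar.com/v1/chicago/matters", params={"$skip": 1000}
)
assert inline == separate
```

Because this sketch re-encodes the query string, the `$` ends up percent-encoded as `%24` in the key, so it differs cosmetically from the key quoted above. What matters for caching is that the two call styles agree.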

Would close #110.

jamesturk commented 2 years ago

Thanks for this. The current behaviour was completely unintentional. I'll push out a new point release soon.