Possible refactor? downloads + post requests

tarakc02 commented 1 month ago

utils.get_url and utils.post_url contain a lot of duplicated code, we thought we could simplify by turning them into one method.

naumansharifwork commented 1 month ago

@tarakc02 @stucka @newsroomdev Our current download method uses the get_request , where ever we need to download the files with the post_request currently we are writing them into respective scrapers which is causing a bit of extra code.

There are 2 ways that i proposed that will help us to refactor this

FIrst is In the arguments of the download function we pass another argument called request_method with the default value of GET. and apply an if condition in the download function to use the get_function when there is no value passed or get is passed, Otherwise use the post_request function
The second way is we create a separate download function in utils that uses the post method, we can name the function something as download_post or something more clear.

Let me know what others think about this, or any other ideas.

stucka commented 1 month ago

A lot of the time it's going to be a one-line fix, to call cache.write_binary to save the contents. But that skips the download functionality of force, which is handy.

@naumansharifwork , I like your first proposal -- it's elegant and clean and makes a ton of sense. as long as a "data"/payload can be passed through kwargs, I think it's great.

I'd also again urge we consider bumping up the download buffer size to something like 1024*1024 from the existing 8192. =)

stucka commented 1 month ago

Oh, and if we're stripping any functionality of existing libraries we'll need to know what's using that functionality and how/when/who refactors the code. If the code is stable and fully functional and already used, there might not be a real reason to remove it ... ?

biglocalnews / clean-scraper

Possible refactor? downloads + post requests #118