windowshopr closed this issue 5 years ago
As a follow-up, I found this, which might be helpful to someone who knows how to implement it:
https://stackoverflow.com/questions/53946083/setting-a-proxy-for-pandas-datareader
Hello @windowshopr, thanks for opening the issue -- what you are describing seems pretty doable. I implemented #55 as a proof of concept (POC): it creates proxied sessions, rather than individual requests, which can be used directly with pandas. Let me know if that's what you had in mind.
@pgaref That looks exactly like what I'm after!
I am running into an issue though with the line:
self.userAgent = UserAgentManager(file=os.path.join(os.path.dirname('__file__'), './user_agents.txt'))
...which gives me an error:
TypeError: __init__() got an unexpected keyword argument 'file'
...so should I just take that out and call UserAgentManager() with no arguments? Thanks!
Hey @windowshopr -- the path should be relative so it should be something like: https://github.com/pgaref/HTTP_Request_Randomizer/blob/78b305a3440f33cfd0caddb8ddf41b5eea974c68/http_request_randomizer/requests/proxy/requestProxy.py#L38
If you use the default (empty) UserAgentManager constructor, it will fall back to fake-useragent, which is also fine (you will notice some log.warn messages).
Other than that, let me know if you run into any other issues, and I can add the pandas-session functionality with some tests in the next release.
Right on! I was able to get it working. Thanks a lot for the help!
Would love some input on how to make that work, specifically when using DataReader and the Yahoo Finance API to get stock data. I can request stock data with DataReader once, but every request after that fails with an error until the next day. My code looks simple:
So how could one integrate the HTTP randomizer into that? I tried playing around with it a bit but couldn't figure it out. Perhaps by somehow replacing the URL that DataReader uses for its request, if that makes sense?
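For anyone landing here later, a minimal sketch of the session-based approach discussed in this thread, using plain requests rather than this library's own classes. It assumes that pandas-datareader's DataReader accepts a `session` argument and that the "yahoo" source is still reachable. The proxy addresses and the helper names (`build_proxies`, `make_proxied_session`, `fetch_prices`) are hypothetical and only there for illustration; this is not necessarily how #55 does it.

```python
import random
from typing import Dict, List, Optional

# Hypothetical proxy addresses (documentation-only IP range); replace with real ones.
PROXIES: List[str] = ["203.0.113.10:8080", "203.0.113.11:3128"]


def build_proxies(address: str) -> Dict[str, str]:
    """Return a requests-style proxies mapping for one host:port address."""
    url = f"http://{address}"
    return {"http": url, "https": url}


def make_proxied_session(addresses: List[str], user_agent: Optional[str] = None):
    """Create a requests.Session that routes traffic through a randomly chosen proxy."""
    import requests  # third-party dependency, imported lazily

    session = requests.Session()
    session.proxies.update(build_proxies(random.choice(addresses)))
    if user_agent:
        session.headers["User-Agent"] = user_agent
    return session


def fetch_prices(symbol: str, start: str, end: str, addresses: List[str] = PROXIES):
    """Fetch price data through a proxied session (assumes pandas-datareader is installed)."""
    from pandas_datareader import data as web  # third-party dependency

    session = make_proxied_session(addresses)
    # DataReader sends its HTTP requests through the session passed here.
    return web.DataReader(symbol, "yahoo", start=start, end=end, session=session)
```

Calling something like `fetch_prices("AAPL", "2019-01-01", "2019-06-30")` would then build a fresh session, and so pick a new random proxy, for every download. Whether that gets around the once-a-day error depends on how the remote side limits requests, so treat it as a starting point.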