Closed (Johnny-Courage020 closed this issue 6 years ago)
This isn't an issue?
Either use a callback:
def loaded(response, *args, **kwargs):
    html_tree = html.fromstring(response.content)
    # add your processing
unsent_request = (grequests.get(url, hooks={'response': loaded}) for url in urls)
or just iterate the responses:
for response in results:
    html_tree = html.fromstring(response.content)
I've not tested either of those but either should work.
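One caveat with iterating the responses: as far as I know, `grequests.map` puts `None` in the result list for any request that failed, so calling `response.content` on every entry can raise an error. Below is a minimal sketch of the iteration approach with that guard added. The helper names `extract_trees` and `fetch_trees` are made up for illustration, and it assumes `html` in the snippets above is `lxml.html`.

```python
def extract_trees(responses, parse):
    """Return parse(response.content) for every usable response.

    `responses` is the list returned by grequests.map; entries are None
    for requests that failed, so those (and non-200 replies) are skipped.
    """
    trees = []
    for response in responses:
        if response is None or response.status_code != 200:
            continue
        trees.append(parse(response.content))
    return trees


def fetch_trees(urls):
    # Imported inside the function because importing grequests applies
    # gevent monkey-patching as a side effect.
    import grequests
    from lxml import html  # assumption: `html` in the thread is lxml.html

    unsent_requests = (grequests.get(url) for url in urls)
    responses = grequests.map(unsent_requests)
    return extract_trees(responses, html.fromstring)
```

Calling `fetch_trees(list_of_urls)` would then give a list of parsed trees, one per page that came back with a 200 status.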
Thanks for the reply! And I'm sorry, no, I guess it's not an issue. Is there a different GitHub functionality I should use when asking a question?
Hey @Johnny-Courage020, use Stack Overflow for questions like this. Also consider closing this issue.
Hi, I'm very green when it comes to Python (or programming in general), but I've written a script that parses about 16000 URLs and extracts some values from each page's HTML.
I'm trying to do this asynchronously with grequests, but I'm having a hard time understanding how to get the actual HTML code for all of the pages simultaneously.
To do this synchronously I'm using the following command:
for url in list_of_urls:
where 'name' is one of the things I'm extracting, there's more but I don't think you'd appreciate me blasting this issue with useless code. Needless to say the above code takes a fuckton of time to process 16000 urls :p
As far as my (very very limited) understanding goes, grequests does:
unsent_request = (grequests.get(url) for url in urls)
creates a list of unsent requests and
results = grequests.map(unsent_request)
issues all of the requests at the same time and waits for all of them to complete.

How do I get an html_tree into Python, for parsing purposes, using grequests?
Thanks a million^million
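With a list as long as the 16000 URLs mentioned above, sending everything in a single `grequests.map` call keeps every response in memory at once. A rough sketch of one way around that is to work through the URLs in batches and cap concurrency with the `size` argument of `grequests.map`. The names `chunked`, `scrape_in_batches`, and `handle`, and the default batch and pool sizes, are assumptions for illustration, not anything from the thread.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def scrape_in_batches(urls, handle, batch_size=100, pool_size=20):
    # Imported inside the function because importing grequests applies
    # gevent monkey-patching as a side effect.
    import grequests

    for batch in chunked(urls, batch_size):
        unsent_requests = (grequests.get(url) for url in batch)
        # `size` limits how many requests are in flight at the same time.
        responses = grequests.map(unsent_requests, size=pool_size)
        for response in responses:
            if response is not None:  # failed requests come back as None
                handle(response)
```

Here `handle` would be whatever per-page processing is needed, for example a function that builds the tree from `response.content` and pulls out the values.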