siznax / wptools

Wikipedia tools (for Humans): easily extract data from Wikipedia, Wikidata, and other MediaWikis
MIT License
575 stars, 79 forks

asyncio support #152

Open uriva opened 4 years ago

uriva commented 4 years ago

Slightly related to https://github.com/siznax/wptools/issues/147, but thought it deserves its own issue.

siznax commented 4 years ago

Thanks, @uriva. Could be nice, but please see #141.

uriva commented 4 years ago

I don't see how this is related. asyncio can help you make parallel requests to two websites, or do CPU-intensive work while waiting for I/O.
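To illustrate the point about parallel requests, here is a minimal sketch using only the standard library. `fake_request` and `fetch_both` are hypothetical names, and `asyncio.sleep` stands in for real network I/O; nothing here is part of wptools.

```python
import asyncio
import time


async def fake_request(name: str, delay: float) -> str:
    # Hypothetical stand-in for a network call; asyncio.sleep yields
    # control to the event loop while "waiting" for the response.
    await asyncio.sleep(delay)
    return f"{name}: done"


async def fetch_both() -> list[str]:
    # gather() schedules both coroutines concurrently on one event loop
    # and returns their results in the order they were passed in.
    return await asyncio.gather(
        fake_request("site-a", 0.2),
        fake_request("site-b", 0.2),
    )


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(fetch_both())
    elapsed = time.perf_counter() - start
    print(results, f"{elapsed:.2f}s")
```

Because the two waits overlap, the total time should be close to the longer of the two delays rather than their sum.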

siznax commented 4 years ago

@uriva You are welcome to submit a PR for this. We'll be interested in ensuring we do not make it easy for our users or this package's user-agent to get banned.

Alternatively, you are welcome to run this package in parallel in your own programs. Why don't you try that first, and see if it is something that may help to have incorporated in this package?
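One way to try running the package in parallel from your own program is a small thread pool, sketched below. `fetch_page` is a hypothetical stand-in for a blocking call such as `wptools.page(title).get()`, and the pool size of 2 is an arbitrary assumption chosen to keep the request rate low.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_page(title: str) -> dict:
    # Hypothetical stand-in for a blocking call such as
    # wptools.page(title).get(); the sleep simulates network latency.
    time.sleep(0.1)
    return {"title": title}


def fetch_all(titles: list[str], max_workers: int = 2) -> list[dict]:
    # A small pool caps how many requests are in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map() returns results in the same order as the inputs.
        return list(pool.map(fetch_page, titles))


if __name__ == "__main__":
    print(fetch_all(["Alpha", "Beta", "Gamma"]))
```

Keeping `max_workers` small is a deliberate choice here, since a large pool makes it easier to hit the remote API hard enough to get the user-agent banned.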

Also, it would help to tell us more about the exact problem you are trying to solve. Maybe there is already a solution.

Good luck!

uriva commented 4 years ago

My scenario is that I have I/O-heavy code making many requests to different REST APIs. The entire process is highly time-sensitive, so the requests must happen in parallel. The architecture of the code is such that the requests do not come from a single function but from various places. This means I can't rely on threading/multiprocessing (that would require a global thread pool, which is awkward and deadlock-prone, as the requests have delicate dependencies). asyncio works great in this scenario, but requires native support from the library (it cannot be wrapped from outside). This is because asyncio in Python is cancerous: it spreads through the whole call chain, down to the code that performs the I/O.

AmoghM commented 3 years ago

@siznax I also realised the need for asyncio while extracting pages, because function calls such as page.get() are very slow. It would definitely be useful to have this enhancement, along with a warning about the risk of getting the user-agent banned.
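Until something like this exists in the package, one possible stopgap is to push the blocking call onto a worker thread from asyncio code and cap concurrency with a semaphore. This is only a sketch and not native asyncio support: each in-flight call still occupies a thread. `blocking_get`, `get_page`, `get_pages`, and the limit of 2 are assumptions for illustration, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def blocking_get(title: str) -> dict:
    # Hypothetical stand-in for a slow blocking call such as page.get().
    time.sleep(0.05)
    return {"title": title}


async def get_page(title: str, limit: asyncio.Semaphore) -> dict:
    # The semaphore bounds how many calls run at the same time.
    async with limit:
        # to_thread() runs the blocking function in a worker thread so
        # the event loop stays free for other tasks.
        return await asyncio.to_thread(blocking_get, title)


async def get_pages(titles: list[str], max_concurrent: int = 2) -> list[dict]:
    limit = asyncio.Semaphore(max_concurrent)
    # gather() returns results in the same order as the input titles.
    return await asyncio.gather(*(get_page(t, limit) for t in titles))


if __name__ == "__main__":
    print(asyncio.run(get_pages(["Alpha", "Beta", "Gamma"])))
```

The semaphore is where a rate-limiting policy could live, which speaks to the concern about getting the user-agent banned.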