zengfung / ClashOfClansScraper

Scraping of data using concurrency #17

Open zengfung opened 1 year ago

zengfung commented 1 year ago

Data scraping should run in parallel where possible. For example, scraping troop data is independent of scraping location and clan data, so these tasks can run simultaneously.
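
As a rough sketch of the idea (not the repo's actual code), the independent scrapes could be submitted to a thread pool. The functions `scrape_troops`, `scrape_locations`, and `scrape_clans` are hypothetical placeholders that sleep to mimic network I/O:

```python
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the real scrapers; each sleeps to mimic network I/O.
def scrape_troops():
    time.sleep(0.2)
    return "troops"


def scrape_locations():
    time.sleep(0.2)
    return "locations"


def scrape_clans():
    time.sleep(0.2)
    return "clans"


def scrape_all():
    """Run the independent scrapers concurrently and collect results by name."""
    tasks = {
        "troops": scrape_troops,
        "locations": scrape_locations,
        "clans": scrape_clans,
    }
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        # result() blocks until that task finishes and re-raises its exception, if any.
        return {name: future.result() for name, future in futures.items()}
```

Because each task spends its time waiting on I/O, the total wall time should be close to the slowest single task rather than the sum of all three.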

zengfung commented 1 year ago

PR #23 enables multithreading when writing (upserting) data to Azure Table Storage.
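
For illustration only, here is a minimal sketch of the threaded upsert pattern. It does not reproduce PR #23: `InMemoryTable` is a hypothetical stand-in for a table client so the example runs without Azure credentials, and `upsert_all` is an assumed helper name. Entities are keyed by `PartitionKey` and `RowKey`, as in Azure Table Storage.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InMemoryTable:
    """Hypothetical stand-in for a table client; keeps entities in a dict."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def upsert_entity(self, entity):
        # Insert the entity, or overwrite an existing one with the same keys.
        key = (entity["PartitionKey"], entity["RowKey"])
        with self._lock:
            self._rows[key] = dict(entity)

    def __len__(self):
        return len(self._rows)


def upsert_all(table, entities, max_workers=8):
    """Upsert entities from several threads (assumed helper, not the repo's API)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so an exception raised in a worker surfaces here.
        list(pool.map(table.upsert_entity, entities))
```

With a real client, each `upsert_entity` call would be a network round trip, which is where the threads help.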

The next step is to experiment with multiprocessing on the data-processing side to see whether it speeds up CPU-intensive computations.
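
A possible shape for that experiment, assuming the processing step is a pure function applied per player. `summarize_player` and the record layout (`tag`, `troops`, `level`) are hypothetical and only illustrate the pattern:

```python
from concurrent.futures import ProcessPoolExecutor


def summarize_player(player):
    """Hypothetical processing step: aggregate troop levels for one player."""
    levels = [troop["level"] for troop in player["troops"]]
    return {
        "tag": player["tag"],
        "troop_count": len(levels),
        "total_level": sum(levels),
    }


def summarize_players(players, max_workers=None, chunksize=16):
    """Apply summarize_player across worker processes, preserving input order."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # chunksize batches records per task to cut inter-process pickling overhead.
        return list(pool.map(summarize_player, players, chunksize=chunksize))
```

On platforms that start workers by spawning a fresh interpreter, the call to `summarize_players` needs to sit under an `if __name__ == "__main__":` guard. Processes only pay off if the per-record work outweighs the cost of pickling records between processes.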

zengfung commented 1 year ago

The average scrape time per player is ~1-2 seconds, and multithreading handles the I/O delay well, resulting in a 90% speedup, so multiprocessing may not be necessary.
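
One way to check this kind of claim is to time the same batch sequentially and with threads. The sketch below uses a hypothetical `scrape_player` that sleeps instead of calling the API, so the numbers it produces are illustrative and not the measurements quoted above:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scrape_player(tag, delay=0.05):
    """Hypothetical scraper: sleeps to mimic an I/O-bound API call."""
    time.sleep(delay)
    return tag


def timed(fn):
    """Return (result, elapsed_seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def compare(tags, max_workers=8):
    """Time a sequential run against a threaded run over the same tags."""
    seq_result, seq_time = timed(lambda: [scrape_player(t) for t in tags])

    def threaded():
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(scrape_player, tags))

    thr_result, thr_time = timed(threaded)
    if seq_result != thr_result:
        raise RuntimeError("threaded run returned different results")
    return seq_time, thr_time
```

If the threaded time is already a small fraction of the sequential time, the workload is dominated by I/O wait and extra processes are unlikely to help much.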