Closed shane-kerr closed 3 years ago
Thanks Shane for this massive work. I started testing this yesterday, but have not yet been able to validate the whole thing. Please gimme some extra days.
(I also had to install Python 3.9 and some libs, but it's OK)
I mean, this commit is very straightforward, but the other needs more careful evaluation from our side
If you can do this as a separate PR, I can approve it right away
Sure, makes sense! I'll make a separate PR presently.
thanks @shane-kerr
The CyclicDetector code is I/O bound, waiting on DNS recursion.
When run independently, it always used 5 workers. I changed this so the worker count can be set with a command-line argument. (I decided not to link this to the number of cores.)
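For illustration, a worker-count option along these lines could be sketched with `argparse`. The flag name `--workers` and the helper names here are assumptions for the example, not necessarily what the PR uses; the default of 5 matches the old fixed value.

```python
import argparse

# The previous hard-coded worker count described above.
DEFAULT_WORKERS = 5


def positive_int(text):
    """Parse a worker count and reject zero or negative values."""
    value = int(text)
    if value < 1:
        raise argparse.ArgumentTypeError("worker count must be at least 1")
    return value


def build_parser():
    """Build a parser with a hypothetical --workers option."""
    parser = argparse.ArgumentParser(
        description="Cyclic dependency scan (illustrative sketch)"
    )
    parser.add_argument(
        "--workers",
        type=positive_int,
        default=DEFAULT_WORKERS,
        help="number of concurrent workers (default: %(default)s)",
    )
    return parser
```

Calling `build_parser().parse_args(["--workers", "200"])` would then give `args.workers == 200`, and omitting the flag keeps the old behavior of 5 workers.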
I was unable to get more than 20 or so workers to run with the multiprocessing approach. Changing this to use normal threads did allow this limit to be bypassed. However, I decided to do a slightly larger modification and switch to asyncio instead, since the process doesn't have to do much work. Using the asyncio version I was able to get 200 concurrent workers running.
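As a rough sketch of the asyncio approach (not the actual code in this PR), a fixed pool of worker tasks can pull name servers from a shared queue. `check_name_server` is a hypothetical stand-in that only sleeps; the real check would await the DNS recursion.

```python
import asyncio


async def check_name_server(name):
    """Hypothetical stand-in for the real, I/O-bound DNS check."""
    await asyncio.sleep(0.01)  # simulates waiting on DNS recursion
    return f"{name}: ok"


async def worker(queue, results):
    """Pull names from the queue until cancelled."""
    while True:
        name = await queue.get()
        try:
            results.append(await check_name_server(name))
        finally:
            # Always mark the item done so queue.join() can return.
            queue.task_done()


async def scan(names, workers=200):
    """Run the check over all names with a fixed number of worker tasks."""
    queue = asyncio.Queue()
    for name in names:
        queue.put_nowait(name)
    results = []
    tasks = [asyncio.create_task(worker(queue, results)) for _ in range(workers)]
    await queue.join()  # wait until every queued name has been processed
    for task in tasks:
        task.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*tasks, return_exceptions=True)
    return results
```

Because every worker is a coroutine in one process, going from 5 to 200 workers costs very little memory, which fits a workload that mostly waits on the network. Results arrive in completion order, not input order.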
Full list of changes:
Note that this version does end up being CPU bound, so a hybrid asyncio/multiprocessing model (like 150 workers per core) might be best. Still, I was able to complete a scan of the 40000 or so name servers that I'm currently worried about in 2.5 minutes, which was fast enough for now.
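The hybrid idea is not implemented here, but one possible shape, purely as a sketch, is to split the name servers across one process per core and run a separate asyncio loop in each. All names below are hypothetical, and `check_name_server` is again a placeholder that only sleeps.

```python
import asyncio
import os
from concurrent.futures import ProcessPoolExecutor


async def check_name_server(name):
    """Hypothetical stand-in for the real DNS check."""
    await asyncio.sleep(0.01)
    return f"{name}: ok"


async def scan_async(names, workers):
    """Check all names with at most `workers` checks in flight."""
    limit = asyncio.Semaphore(workers)

    async def bounded(name):
        async with limit:
            return await check_name_server(name)

    return await asyncio.gather(*(bounded(name) for name in names))


def scan_chunk(names, workers):
    """Process entry point: run one event loop over one chunk."""
    return asyncio.run(scan_async(names, workers))


def split(names, parts):
    """Deal names round-robin into `parts` lists."""
    return [names[i::parts] for i in range(parts)]


def hybrid_scan(names, workers_per_core=150):
    """One process per core, each running its own asyncio workers."""
    cores = os.cpu_count() or 1
    chunks = [chunk for chunk in split(list(names), cores) if chunk]
    with ProcessPoolExecutor(max_workers=cores) as pool:
        per_chunk = pool.map(scan_chunk, chunks, [workers_per_core] * len(chunks))
        return [result for chunk in per_chunk for result in chunk]
```

On platforms that start processes with spawn, `hybrid_scan` should be called from under an `if __name__ == "__main__":` guard. Whether 150 workers per core is the right number would need measuring.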