Closed KingAkeem closed 3 years ago
I'd like to tackle this. I'm thinking multiprocessing.Pool.apply_async() will help nicely here, but we'll have to fetch the async results as they come in. I can't recall offhand whether handling this with a for loop left open for streaming data will work or not 🤷♂️
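For reference, a rough sketch of what I mean, not a tested patch for TorBot. It assumes each `apply_async()` call gets a `callback` that pushes the result onto a `queue.Queue`, so a plain for loop can consume results in completion order. `fake_status` and `stream_results` are hypothetical names, and `fake_status` is a stand-in that makes no real request.

```python
import queue
from multiprocessing import Pool


def fake_status(url):
    # Hypothetical stand-in for the real HTTP request; no network access.
    return url, 200


def stream_results(pool, func, items):
    """Submit each item with apply_async and yield results as they finish."""
    done = queue.Queue()
    items = list(items)
    for item in items:
        # Callbacks run in the parent process, so a thread-safe queue is enough.
        # A failed task is yielded as its exception object instead of a result.
        pool.apply_async(func, (item,), callback=done.put,
                         error_callback=done.put)
    for _ in items:
        yield done.get()


if __name__ == "__main__":
    links = [f"http://example.onion/page{i}" for i in range(5)]
    with Pool(processes=2) as pool:
        for url, status in stream_results(pool, fake_status, links):
            print(url, status)
```

The for loop blocks on `done.get()` until the next result arrives, so output appears as each worker finishes rather than after the whole batch.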
I'm currently working on this, but my solution uses Golang. If you're able to write something in Python with similar performance, that would be fine by me.
@KingAkeem - Golang would be easier for this for sure, but I think there may be a way to do it in Python.
How were you planning on tying this project to a Golang-based multithreaded URL analyzer?
I'm going to start by just using Go without any attempts at optimization. Since one of our bottlenecks is I/O, I think simply using Go to perform the request and print the status will provide a significant speed increase. After that, I'll run each request in a separate goroutine and take advantage of Go's built-in concurrency.
I have a somewhat working prototype currently. You can check out this PR: https://github.com/DedSecInside/TorBot/pull/174. This repo contains the Golang code that's being used as a plugin: https://github.com/KingAkeem/display_status
Sounds good, I'll un-assign myself from this issue. I'll take a look at the PR too 👍
Sounds good to me :+1:
Describe the bug
We're using multithreading to display links, and Python isn't great at handling threads due to the GIL, so this causes stalling when trying to display several hundred (at minimum) links. I'm thinking that the solution in this case is to process the links asynchronously instead. It may not be as fast, but it'll be stable.
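As a rough illustration of the asynchronous approach, here is a minimal sketch, not the project's implementation. It assumes `asyncio` with a semaphore capping the number of requests in flight. `fake_fetch` and `check_links` are hypothetical names, and `fake_fetch` is a stand-in that makes no real request.

```python
import asyncio


async def fake_fetch(url):
    # Hypothetical stand-in for an async HTTP request; no network access.
    await asyncio.sleep(0)
    return url, 200


async def check_links(urls, fetch, limit=10):
    """Run fetch over urls with at most `limit` in flight at once.

    Results are returned in completion order, not input order.
    """
    sem = asyncio.Semaphore(limit)

    async def bounded(url):
        async with sem:
            return await fetch(url)

    results = []
    for fut in asyncio.as_completed([bounded(u) for u in urls]):
        results.append(await fut)
    return results


if __name__ == "__main__":
    links = [f"http://example.onion/page{i}" for i in range(5)]
    for url, status in asyncio.run(check_links(links, fake_fetch, limit=2)):
        print(url, status)
```

The semaphore keeps a page with several hundred links from opening every connection at once, which is the stalling case described above.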
To Reproduce
python3 torBot.py -u https://hiddenwikitor.org
Expected behavior
Links are retrieved until the user cancels the operation or all links have been found.