It seems that when the queue is empty at the end (https://github.com/kootenpv/sky/blob/master/sky/crawler/crawling.py#L338), it tries to end the futures (which are wrapped in `wait_for`), and this causes the tasks to not end "normally". The documentation says that this might be a mistake (though I don't think it is in this case?). For each worker it produces an "ERROR" in the log, which is not really nice when you want to report actual errors.

I'm trying to stay up-to-date with the 500lines crawler, but somehow it hangs when crawling a huge website (just as the queue gets empty). That's why I added the `wait_for`.

@ajdavis @asvetlov Do you have any idea how to prevent those tasks from spitting out `Task was destroyed!` at the end, or how else to solve this issue? I'd be really grateful!
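For reference, here is a minimal sketch of one way to shut workers down without the `Task was destroyed but it is pending!` message: wait for `queue.join()`, cancel the now-idle workers explicitly, and then gather the cancelled tasks so the cancellation is actually consumed before the loop closes. This is a simplified stand-alone example, not the crawler's actual code; `worker`, `crawl`, and the fake work item are made up for illustration.

```python
import asyncio

async def worker(queue, results):
    # Pull items from the queue forever; the task ends only when cancelled.
    while True:
        item = await queue.get()
        try:
            results.append(item)  # stand-in for the real fetch/parse work
        finally:
            queue.task_done()

async def crawl(items):
    queue = asyncio.Queue()
    for item in items:
        queue.put_nowait(item)
    results = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(3)]
    await queue.join()           # wait until every queued item is processed
    for w in workers:            # then cancel the idle workers explicitly
        w.cancel()
    # Awaiting the cancelled tasks (swallowing CancelledError) means no task
    # is still pending when the loop shuts down, so nothing gets logged.
    await asyncio.gather(*workers, return_exceptions=True)
    return results

if __name__ == "__main__":
    print(len(asyncio.run(crawl(range(10)))))
```

The key point is that simply letting the tasks be garbage-collected while still pending is what triggers the destroyed-task message; cancelling and then awaiting them gives each one a chance to finish its cancellation cleanly.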