pythonhacker / harvestman-crawler

Automatically exported from code.google.com/p/harvestman-crawler

Killing harvestman with "ctrl+C" #11

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago

Original issue reported on code.google.com by szybal...@gmail.com on 28 Jun 2008 at 6:13

GoogleCodeExporter commented 9 years ago
While crawling a website I wanted to stop the download, so I hit Ctrl+C and
got:
[01:12:54] 4120 links scanned in 1 server .
[01:12:54] 69 files written.
[01:12:54] 1450624  bytes received at the rate of 8.12 KB/sec .
[01:12:54] 9498335  bytes were written to disk.

[01:12:54] *** Log Completed ***

Writing project statistics to crawl database... 
Done. 
HarvestMan session finished. 

At this point the program still appears to be running: the session finished,
but the process did not exit. The only way out is to press ctrl+x to suspend
it and then use the "kill #" command.

^X
[3]+  Stopped                 harvestman -C automotive.xml

 ps -A|grep harvestman
 5345 pts/0    00:00:00 harvestman
 5385 pts/0    00:00:00 harvestman
 5626 pts/0    00:00:00 harvestman

kill 5345
kill 5385
kill 5626

Any ideas why? Is an exception being caught somewhere? If so, we should show
something like "press Ctrl+C again to exit", or ask users whether they want
to quit.

Original comment by szybal...@gmail.com on 28 Jun 2008 at 6:19

GoogleCodeExporter commented 9 years ago
Fixed this. A previous commit of mine modified the code that handles the
interrupt condition (endloop in urlqueue.py), so the threads were no longer
killed on SIGINT (KeyboardInterrupt/Ctrl-C). This has been fixed: Ctrl-C now
works, killing all threads and bringing the program to a stop.

Original comment by abpil...@gmail.com on 1 Jul 2008 at 9:38
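
For anyone who hits a similar hang in their own threaded crawler, here is a
minimal sketch of the general pattern described above: the main thread catches
KeyboardInterrupt, signals every worker to stop, and joins them so the process
can exit. This is not HarvestMan's actual endloop code. The names worker,
run_crawl and the wait parameter are hypothetical and exist only for
illustration.

```python
import threading
import time


def worker(stop_event, counts, idx):
    # Hypothetical crawler worker: loops until asked to stop.
    while not stop_event.is_set():
        counts[idx] += 1        # stand-in for fetching one URL
        stop_event.wait(0.01)   # sleeps, but wakes at once when stop is set


def _wait_forever():
    # Keep the main thread alive so Ctrl-C is delivered to it.
    while True:
        time.sleep(0.1)


def run_crawl(num_threads=3, wait=None):
    """Start workers, block in wait(), and stop every worker on Ctrl-C."""
    stop_event = threading.Event()
    counts = [0] * num_threads
    threads = [
        threading.Thread(target=worker, args=(stop_event, counts, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    try:
        # Blocks until interrupted; a custom wait() can be passed for testing.
        (wait or _wait_forever)()
    except KeyboardInterrupt:
        print("Interrupted, stopping threads...")
    finally:
        # Without this step the non-daemon workers keep the process alive,
        # which matches the symptom reported in this issue.
        stop_event.set()
        for t in threads:
            t.join()
    return threads


# Usage (blocks until Ctrl-C is pressed):
#     run_crawl()
```

The key design choice in this sketch is that workers poll a shared
threading.Event instead of blocking indefinitely, so a single Ctrl-C is enough
to bring everything down without resorting to kill.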