disinfoRG / FbScraper

MIT License

If the db connection is interrupted, discover does not stop automatically #13

Closed: andreawwenyi closed this issue 4 years ago

andreawwenyi commented 4 years ago

Command: python discover.py -c

The db connection dropped with this error:

packet_write_wait: Connection to 172.104.98.144 port 22: Broken pipe

After that, the discover process did not finish. It kept running without adding new posts. Here is a partial log:

crawler_timestamp_1580707171: viewed 799 posts, add 0 new posts, existing 830 posts in database, empty response count #0 
pipeline_timestamp_1580707208: insert to database, table = Article, id = [1097750], data = {'first_snapshot_at': 0, 'last_snapshot_at': 0, 'next_snapshot_at': -1, 'snapshot_count': 0, 'url_hash': 3792475091, 'url': 'https://www.facebook.com/fuqidao168/posts/1768153853444519', 'site_id': 33, 'article_type': 'FBPost', 'redirect_to': None} 
pipeline_timestamp_1580707208: insert to database, table = Article, id = [1097751], data = {'first_snapshot_at': 0, 'last_snapshot_at': 0, 'next_snapshot_at': -1, 'snapshot_count': 0, 'url_hash': 2331910823, 'url': 'https://www.facebook.com/fuqidao168/posts/1768322396760998', 'site_id': 33, 'article_type': 'FBPost', 'redirect_to': None} 
pipeline_timestamp_1580707209: insert to database, table = Article, id = [1097752], data = {'first_snapshot_at': 0, 'last_snapshot_at': 0, 'next_snapshot_at': -1, 'snapshot_count': 0, 'url_hash': 2484461609, 'url': 'https://www.facebook.com/fuqidao168/posts/1768153670111204', 'site_id': 33, 'article_type': 'FBPost', 'redirect_to': None} 
pipeline_timestamp_1580707209: insert to database, table = Article, id = [1097753], data = {'first_snapshot_at': 0, 'last_snapshot_at': 0, 'next_snapshot_at': -1, 'snapshot_count': 0, 'url_hash': 1321297792, 'url': 'https://www.facebook.com/fuqidao168/posts/2454242741502290', 'site_id': 33, 'article_type': 'FBPost', 'redirect_to': None} 
crawler_timestamp_1580707209: viewed 803 posts, add 4 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707246: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707283: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707319: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707356: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707393: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707430: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707466: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707503: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707540: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707577: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707614: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707650: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707687: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707724: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707761: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707798: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707834: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707871: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707908: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707944: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580707981: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580708018: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580708055: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 
crawler_timestamp_1580708092: viewed 803 posts, add 0 new posts, existing 834 posts in database, empty response count #0 

dieface commented 4 years ago

Resolved by PR #23.
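
The contents of PR #23 are not shown in this thread, so the following is only a minimal sketch of one way a discover-style loop could stop itself when the database becomes unreachable or when it keeps adding 0 new posts, as in the log above. Everything in it is an assumption made for illustration: the function name `discover`, the `article` table layout, the `max_idle_rounds` parameter, and the use of Python's built-in `sqlite3` as a stand-in for the project's real database.

```python
import sqlite3
from typing import Iterable, List, Tuple


def discover(
    batches: Iterable[List[str]],
    conn: sqlite3.Connection,
    max_idle_rounds: int = 3,  # hypothetical threshold, not from the project
) -> Tuple[int, str]:
    """Insert newly seen URLs round by round and report why the loop ended.

    Returns (number_of_rows_added, reason), where reason is one of
    "connection_lost", "stalled" or "exhausted".
    """
    added_total = 0
    idle_rounds = 0
    for batch in batches:
        try:
            # Cheap health check before each round: fails fast if the
            # connection is gone instead of letting the loop spin.
            conn.execute("SELECT 1")
            added = 0
            for url in batch:
                cur = conn.execute(
                    "INSERT OR IGNORE INTO article (url) VALUES (?)", (url,)
                )
                added += cur.rowcount  # 1 if inserted, 0 if already present
            conn.commit()
        except sqlite3.Error:
            # Any database error ends the run rather than being ignored.
            return added_total, "connection_lost"

        added_total += added
        idle_rounds = 0 if added else idle_rounds + 1
        if idle_rounds >= max_idle_rounds:
            # Several rounds in a row added nothing: stop instead of looping.
            return added_total, "stalled"
    return added_total, "exhausted"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE article (url TEXT PRIMARY KEY)")
    print(discover([["a", "b"], ["a"], ["b"], ["a"], ["c"]], conn))
    conn.close()
    print(discover([["x"]], conn))  # closed connection simulates the outage
```

The design choice here is to treat both a failed health check and a long run of empty rounds as reasons to exit, so a dropped connection cannot leave the process running indefinitely. A real fix would use the exception types of the project's actual database driver.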