apache / incubator-stormcrawler

A scalable, mature and versatile web crawler based on Apache Storm
https://stormcrawler.apache.org/
Apache License 2.0
887 stars 262 forks

exited loop! continuously #440

Closed MyraBaba closed 7 years ago

MyraBaba commented 7 years ago

Hi,

We used StormCrawler for testing in January, and at that time we ran it by building an uber jar. Flux was not configured then.

Yesterday we updated StormCrawler; it has changed to Flux, and we couldn't create the uber jar the old way.

So we used Flux and injected the URLs.

Crawling starts...

But it always exits the loop after a few minutes. We don't know why.

Here is the last log:

```
63501 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.yenicaggazetesi.com.tr/sinema-haberleri-58hk.htm with status 200 in msec 622
63505 [Thread-30-parse-executor[5 5]] INFO c.d.s.b.JSoupParserBolt - Parsing : starting http://www.yenicaggazetesi.com.tr/sinema-haberleri-58hk.htm
63509 [Thread-30-parse-executor[5 5]] INFO c.d.s.b.JSoupParserBolt - Parsed http://www.yenicaggazetesi.com.tr/sinema-haberleri-58hk.htm in 3 msec
63650 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
63650 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.motosiklet.net/forum/motosiklet-modelleri/?pp=20&daysprune=-1&prefixid=39_Kuba_Yuan_Motosiklet with status 200 in msec 324
63655 [Thread-30-parse-executor[5 5]] INFO c.d.s.b.JSoupParserBolt - Parsing : starting http://www.motosiklet.net/forum/motosiklet-modelleri/?pp=20&daysprune=-1&prefixid=39_Kuba_Yuan_Motosiklet
63660 [Thread-30-parse-executor[5 5]] INFO c.d.s.b.JSoupParserBolt - Parsed http://www.motosiklet.net/forum/motosiklet-modelleri/?pp=20&daysprune=-1&prefixid=39_Kuba_Yuan_Motosiklet in 4 msec
63669 [main] INFO o.a.s.l.ThriftAccessLogger - Request ID: 1 access from: principal: operation: killTopology
63681 [main] INFO o.a.s.d.nimbus - Delaying event :remove for 300 secs for crawler-1-1489779049
63692 [main] INFO o.a.s.d.nimbus - Adding topo to history log: crawler-1-1489779049
63697 [main] INFO o.a.s.d.nimbus - Shutting down master
63699 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63699 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc043490003
63700 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc043490003 closed
63700 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63700 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63700 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /0:0:0:0:0:0:0:1:54079 which had sessionid 0x15addc043490003
63700 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc043490004
63701 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc043490004 closed
63701 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63701 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /0:0:0:0:0:0:0:1:54080 which had sessionid 0x15addc043490004
63701 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63701 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc043490000
63702 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc043490000 closed
63702 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63702 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:54077 which had sessionid 0x15addc043490000
63702 [main] INFO o.a.s.zookeeper - closing zookeeper connection of leader elector.
63702 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63702 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc043490001
63703 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc043490001 closed
63703 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63703 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:54078 which had sessionid 0x15addc043490001
63703 [main] INFO o.a.s.d.nimbus - Shut down master
63703 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63703 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc043490006
63703 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc043490006 closed
63703 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63704 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:54082 which had sessionid 0x15addc043490006
63704 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63704 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc043490008
63704 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc043490008 closed
63704 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63704 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:54084 which had sessionid 0x15addc043490008
63705 [main] INFO o.a.s.d.supervisor - Shutting down supervisor 65050983-1686-4239-9c57-b66ea773bb1e
63706 [Thread-7] INFO o.a.s.event - Event manager interrupted
63706 [Thread-8] INFO o.a.s.event - Event manager interrupted
63706 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63706 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc04349000a
63707 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc04349000a closed
63707 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63707 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:54086 which had sessionid 0x15addc04349000a
63711 [main] INFO o.a.s.d.supervisor - Shutting down 3d22da9e-e57a-4d44-9bb9-eb910686258d:50615cfd-191a-4432-8580-a5fcade0034f
63711 [main] INFO o.a.s.config - GET worker-user 50615cfd-191a-4432-8580-a5fcade0034f
63713 [main] INFO o.a.s.process-simulator - Killing process e285c014-a7d5-4cfe-977b-8bdc5e9478d5
63713 [main] INFO o.a.s.d.worker - Shutting down worker crawler-1-1489779049 3d22da9e-e57a-4d44-9bb9-eb910686258d 1027
63713 [main] INFO o.a.s.d.worker - Terminating messaging context
63713 [main] INFO o.a.s.d.worker - Shutting down executors
63714 [main] INFO o.a.s.d.executor - Shutting down executor spout:[8 8]
63715 [Thread-14-spout-executor[8 8]] INFO o.a.s.util - Async loop interrupted!
63715 [Thread-13-disruptor-executor[8 8]-send-queue] INFO o.a.s.util - Async loop interrupted!
63752 [main] INFO o.a.s.d.executor - Shut down executor spout:[8 8]
63753 [main] INFO o.a.s.d.executor - Shutting down executor metricscom.digitalpebble.stormcrawler.elasticsearch.metrics.MetricsConsumer:[2 2]
63753 [Thread-16-metricscom.digitalpebble.stormcrawler.elasticsearch.metrics.MetricsConsumer-executor[2 2]] INFO o.a.s.util - Async loop interrupted!
63753 [Thread-15-disruptor-executor[2 2]-send-queue] INFO o.a.s.util - Async loop interrupted!
63771 [main] INFO o.a.s.d.executor - Shut down executor metricscom.digitalpebble.stormcrawler.elasticsearch.metrics.MetricsConsumer:[2 2]
63771 [main] INFO o.a.s.d.executor - Shutting down executor sitemap:[7 7]
63771 [Thread-18-sitemap-executor[7 7]] INFO o.a.s.util - Async loop interrupted!
63771 [Thread-17-disruptor-executor[7 7]-send-queue] INFO o.a.s.util - Async loop interrupted!
63772 [main] INFO o.a.s.d.executor - Shut down executor sitemap:[7 7]
63772 [main] INFO o.a.s.d.executor - Shutting down executor fetcher:[3 3]
63772 [Thread-20-fetcher-executor[3 3]] INFO o.a.s.util - Async loop interrupted!
63772 [Thread-19-disruptor-executor[3 3]-send-queue] INFO o.a.s.util - Async loop interrupted!
63773 [main] INFO o.a.s.d.executor - Shut down executor fetcher:[3 3]
63773 [main] INFO o.a.s.d.executor - Shutting down executor acker:[1 1]
63773 [Thread-22-acker-executor[1 1]] INFO o.a.s.util - Async loop interrupted!
63773 [Thread-21-disruptor-executor[1 1]-send-queue] INFO o.a.s.util - Async loop interrupted!
63774 [main] INFO o.a.s.d.executor - Shut down executor acker:[1 1]
63774 [main] INFO o.a.s.d.executor - Shutting down executor partitioner:[6 6]
63774 [Thread-24-partitioner-executor[6 6]] INFO o.a.s.util - Async loop interrupted!
63774 [Thread-23-disruptor-executor[6 6]-send-queue] INFO o.a.s.util - Async loop interrupted!
63775 [main] INFO o.a.s.d.executor - Shut down executor partitioner:[6 6]
63775 [main] INFO o.a.s.d.executor - Shutting down executor status:[9 9]
63775 [Thread-26-status-executor[9 9]] INFO o.a.s.util - Async loop interrupted!
63775 [Thread-25-disruptor-executor[9 9]-send-queue] INFO o.a.s.util - Async loop interrupted!
63792 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.yeniasya.com.tr/dunya/bm-de-israil-baskisi-istifasi_426691 with status 200 in msec 5277
63807 [elasticsearch[Chrome][listener][T#2]] WARN c.d.s.e.p.StatusUpdaterBolt - Could not find unacked tuple for 124583fdd7b4381ce05520cfb834ace3112ff384a87550e599db05413e0e9362
63807 [elasticsearch[Chrome][listener][T#2]] WARN c.d.s.e.p.StatusUpdaterBolt - Could not find unacked tuple for 174540b1ff7f7bf70bef56c57cf29e100d6f8847e4b5d648608848303436c2db
63807 [elasticsearch[Chrome][listener][T#2]] INFO c.d.s.e.p.StatusUpdaterBolt - Bulk response 211, waitAck 0, acked 211
63810 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
63810 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.hurriyet.com.tr with status 200 in msec 330
63817 [main] INFO o.a.s.d.executor - Shut down executor status:[9 9]
63818 [main] INFO o.a.s.d.executor - Shutting down executor system:[-1 -1]
63818 [Thread-28-system-executor[-1 -1]] INFO o.a.s.util - Async loop interrupted!
63818 [Thread-27-disruptor-executor[-1 -1]-send-queue] INFO o.a.s.util - Async loop interrupted!
63819 [main] INFO o.a.s.d.executor - Shut down executor __system:[-1 -1]
63819 [main] INFO o.a.s.d.executor - Shutting down executor parse:[5 5]
63819 [Thread-30-parse-executor[5 5]] INFO o.a.s.util - Async loop interrupted!
63819 [Thread-29-disruptor-executor[5 5]-send-queue] INFO o.a.s.util - Async loop interrupted!
63820 [main] INFO o.a.s.d.executor - Shut down executor parse:[5 5]
63820 [main] INFO o.a.s.d.executor - Shutting down executor index:[4 4]
63820 [Thread-32-index-executor[4 4]] INFO o.a.s.util - Async loop interrupted!
63820 [Thread-31-disruptor-executor[4 4]-send-queue] INFO o.a.s.util - Async loop interrupted!
63841 [main] INFO o.a.s.d.executor - Shut down executor index:[4 4]
63841 [main] INFO o.a.s.d.worker - Shut down executors
63841 [main] INFO o.a.s.d.worker - Shutting down transfer thread
63841 [Thread-33-disruptor-worker-transfer-queue] INFO o.a.s.util - Async loop interrupted!
63842 [main] INFO o.a.s.d.worker - Shut down transfer thread
63842 [main] INFO o.a.s.d.worker - Shut down backpressure thread
63843 [main] INFO o.a.s.d.worker - Shutting down default resources
63843 [main] INFO o.a.s.d.worker - Shut down default resources
63843 [main] INFO o.a.s.d.worker - Trigger any worker shutdown hooks
63848 [main] INFO o.a.s.d.worker - Disconnecting from storm cluster state context
63848 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63848 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc043490011
63849 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc043490011 closed
63849 [Thread-10-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63850 [main] INFO o.a.s.d.worker - Shut down worker crawler-1-1489779049 3d22da9e-e57a-4d44-9bb9-eb910686258d 1027
63850 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:54097 which had sessionid 0x15addc043490011
63865 [main] INFO o.a.s.config - REMOVE worker-user 50615cfd-191a-4432-8580-a5fcade0034f
63866 [main] INFO o.a.s.d.supervisor - Shut down 3d22da9e-e57a-4d44-9bb9-eb910686258d:50615cfd-191a-4432-8580-a5fcade0034f
63866 [main] INFO o.a.s.d.supervisor - Shutting down supervisor 3d22da9e-e57a-4d44-9bb9-eb910686258d
63867 [Thread-9] INFO o.a.s.event - Event manager interrupted
63867 [Thread-10] INFO o.a.s.event - Event manager interrupted
63867 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting
63868 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: 0x15addc04349000c
63869 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x15addc04349000c closed
63869 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down
63869 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:54088 which had sessionid 0x15addc04349000c
63869 [main] INFO o.a.s.testing - Shutting down in process zookeeper
63870 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxnFactory - NIOServerCnxn factory exited run method
63872 [main] INFO o.a.s.s.o.a.z.s.ZooKeeperServer - shutting down
63872 [main] INFO o.a.s.s.o.a.z.s.SessionTrackerImpl - Shutting down
63872 [main] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Shutting down
63872 [main] INFO o.a.s.s.o.a.z.s.SyncRequestProcessor - Shutting down
63872 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - PrepRequestProcessor exited loop!
63872 [SyncThread:0] INFO o.a.s.s.o.a.z.s.SyncRequestProcessor - SyncRequestProcessor exited!
63872 [main] INFO o.a.s.s.o.a.z.s.FinalRequestProcessor - shutdown of request processor complete
63872 [main] INFO o.a.s.testing - Done shutting down in process zookeeper
63872 [main] INFO o.a.s.testing - Deleting temporary path /var/folders/ly/fkgkdh_959g_p_m85pvtnnjw0000gn/T//3fc6734c-9fce-4b0a-b533-5484cc04b45d
63877 [main] INFO o.a.s.testing - Deleting temporary path /var/folders/ly/fkgkdh_959g_p_m85pvtnnjw0000gn/T//c0a2c454-3249-48dd-853a-183e05ca4736
63878 [main] INFO o.a.s.testing - Deleting temporary path /var/folders/ly/fkgkdh_959g_p_m85pvtnnjw0000gn/T//2fe8476d-0fb3-47ef-a9da-dc8dcb48a079
63880 [main] INFO o.a.s.testing - Deleting temporary path /var/folders/ly/fkgkdh_959g_p_m85pvtnnjw0000gn/T//54df0f86-cfce-4a5b-b73b-6a6bcfed3570
64035 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
64035 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.kibrisgazetesi.com/ekonomi/seyrusefer-kiyagi-luks-araclara-yarayacak/14657 with status 200 in msec 4604
64175 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
64175 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.haberx.com/arsiv(20,a,2016-06-07,193).aspx with status 200 in msec 6642
64183 [FetcherThread] ERROR c.d.s.b.FetcherBolt - Exception while fetching http://www.cnnturk.com/ajanda/tahsin-yarali-kurtuldu-cesur-ve-guzel-19-yeni-bolum-fragmani-son-bolumun-ardindan-yayinlanacak
org.apache.http.NoHttpResponseException: The target server failed to respond
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:220) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:164) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:139) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at com.digitalpebble.stormcrawler.protocol.httpclient.HttpProtocol.getProtocolOutput(HttpProtocol.java:148) ~[haberCrawl-1.0-SNAPSHOT.jar:?]
    at com.digitalpebble.stormcrawler.bolt.FetcherBolt$FetcherThread.run(FetcherBolt.java:493) [haberCrawl-1.0-SNAPSHOT.jar:?]
64299 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
64299 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://webtv.radikal.com.tr/spor/ with status 200 in msec 2376
64351 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.objektifhaber.com/ataturk-havalimaninda-patlama-ve-silah-sesleri-1688-foto/ with status 200 in msec 1066
64512 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.pressturk.com/firma/kategori/dogalgaz/49/ with status 200 in msec 108
64640 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.internethaber.com/isimsizler-1-bolum-fragmani-video-galerisi-1762077.htm with status 200 in msec 1324
64678 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.yenicaggazetesi.com.tr/bitliste-bes-minare-42028yy.htm with status 200 in msec 171
64756 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
64756 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.cumhuriyet.com.tr/arama/metrob%C3%BCs with status 200 in msec 777
64843 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
64843 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.hurriyet.com.tr/itu-mizahfest-ile-guldurecek-40379118 with status 200 in msec 27
64940 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
64940 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.motosiklet.net/forum/motosiklet-modelleri/?pp=20&daysprune=-1&prefixid=131_ktm_motorsiklet with status 200 in msec 290
64948 [FetcherThread] WARN c.d.s.p.h.HttpProtocol - HTTP content trimmed to 65536
64948 [FetcherThread] INFO c.d.s.b.FetcherBolt - [Fetcher #3] Fetched http://www.yeniasya.com.tr/etiket/ba%C5%9F%C3%B6rt%C3%BCs%C3%BC with status 200 in msec 152
65504 [SessionTracker] INFO o.a.s.s.o.a.z.s.SessionTrackerImpl - SessionTrackerImpl exited loop!
alpullu:haberCrawl alpullu$
```

jnioche commented 7 years ago

> Yesterday we updated StormCrawler; it has changed to Flux, and we couldn't create the uber jar the old way.

The Java topology is still in the archetype, and you should be able to use it in exactly the same way as before.

I presume you are running it in local mode. Which version of Storm are you on?

> 63669 [main] INFO o.a.s.l.ThriftAccessLogger - Request ID: 1 access from: principal: operation: killTopology

This looks like a Storm-related issue; it would be worth checking whether you have the same problem when running the Java topology. If not, the problem could be related to Flux.

Anything relevant in the Nimbus log file?

MyraBaba commented 7 years ago

With the same setup I can run it the old way (`storm jar classfile -conf xxxx -local`) without any problem.

But when I try Flux, after a few minutes this happens and the topology is killed.

Storm is 1.0.2.

The Nimbus log file shows no errors. Clean and smooth.

Best.


jnioche commented 7 years ago

[http://user.storm.apache.narkive.com/lA9pc40i/flux]

By default, Flux runs local-mode topologies for 60 seconds.

Try setting a ridiculously large value with `-s` (the sleep time, in milliseconds), or run in remote mode, which gives you the benefits of the Storm UI, proper log files, etc.

Will add a note in the README generated by the archetype.

Thanks for reporting this!
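
As a rough sketch of the suggestion above, the local-mode Flux command with a longer sleep could look like the following. The jar path and the Flux file name are assumptions (the jar name is taken from the stack trace in the log, `crawler.flux` is a placeholder), so substitute your own. The command is only printed here, so the snippet runs without a Storm install.

```shell
# Assumed paths: adjust to your own build output and Flux definition.
JAR="target/haberCrawl-1.0-SNAPSHOT.jar"   # jar name seen in the log; the target/ path is a guess
FLUX_FILE="crawler.flux"                   # placeholder file name

# Flux's -s / --sleep option takes milliseconds; 24 hours in this sketch.
SLEEP_MS=$((24 * 60 * 60 * 1000))

# Build the local-mode command and print it instead of executing it.
CMD="storm jar $JAR org.apache.storm.flux.Flux --local $FLUX_FILE --sleep $SLEEP_MS"
echo "$CMD"
```

For a long-running crawl, replacing `--local` with `--remote` submits the topology to a cluster instead, and the sleep setting no longer applies.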

MyraBaba commented 7 years ago

:))))

Thanks
