apache / incubator-stormcrawler

A scalable, mature and versatile web crawler based on Apache Storm
https://stormcrawler.apache.org/
Apache License 2.0

Receiving org.apache.storm.utils.NimbusLeaderNotFoundException when executing CrawlTopology #417

Closed isspek closed 7 years ago

isspek commented 7 years ago

I created a new StormCrawler-based project by following the steps described in I. Without modifying crawler-conf.yaml, I executed the CrawlTopology.java that comes with the project and received the error below:

org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [localhost]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
    at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:90)
    at org.apache.storm.StormSubmitter.topologyNameExists(StormSubmitter.java:371)
    at org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:233)
    at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:311)
    at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:157)
    at com.digitalpebble.stormcrawler.ConfigurableTopology.submit(ConfigurableTopology.java:85)
    at com.mycompany.crawler.CrawlTopology.run(CrawlTopology.java:68)
    at com.digitalpebble.stormcrawler.ConfigurableTopology.start(ConfigurableTopology.java:50)
    at com.mycompany.crawler.CrawlTopology.main(CrawlTopology.java:38)

I searched Google for the cause of the problem, and it seems to be related to ZooKeeper. But I am using a Tomcat server. What should I do to prevent this error?

sebastian-nagel commented 7 years ago

How was the CrawlTopology executed? Normally, the execution is done by Storm:

storm jar .../path/to/storm-crawler.jar com.digitalpebble.stormcrawler.CrawlTopology -conf ...

isspek commented 7 years ago

@sebastian-nagel I created it as a Maven project in Eclipse because I wanted to figure out how the crawler works. I executed the main method by running it as an application in Eclipse, and then I got these errors. I haven't installed Storm on my machine, because it is already a dependency in the pom. Should I install it on my machine?

jnioche commented 7 years ago

Add -local and -conf crawler-conf.yaml as arguments to the topology class. You don't have to run it with Storm; it works from Eclipse. If -local is not specified, the topology tries to connect to a Storm cluster, and since you haven't installed one, you get this error.
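To illustrate why the missing -local flag leads to the Nimbus lookup, here is a minimal sketch of how a launcher could choose between in-process execution and cluster submission. It is not StormCrawler's actual ConfigurableTopology code: the class LaunchModeSketch, its Options type and the describe() text are hypothetical names made up for this example, and only the -local and -conf arguments come from the thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; NOT the real ConfigurableTopology implementation. */
class LaunchModeSketch {

    /** Options recovered from the command line. */
    static final class Options {
        final boolean local;            // true when -local was passed
        final List<String> confFiles;   // file names given via -conf

        Options(boolean local, List<String> confFiles) {
            this.local = local;
            this.confFiles = confFiles;
        }

        /** Describes which execution path the launcher would take. */
        String describe() {
            return local
                    ? "run in-process, no Storm cluster needed"
                    : "submit to the cluster listed in nimbus.seeds";
        }
    }

    /** Scans the arguments for -local and for -conf followed by a file name. */
    static Options parse(String[] args) {
        boolean local = false;
        List<String> confFiles = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-local".equals(args[i])) {
                local = true;
            } else if ("-conf".equals(args[i])) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-conf requires a file name");
                }
                confFiles.add(args[++i]); // consume the file name as well
            }
        }
        return new Options(local, confFiles);
    }

    public static void main(String[] args) {
        Options options = parse(args);
        System.out.println("conf files: " + options.confFiles);
        System.out.println("mode: " + options.describe());
    }
}
```

In Eclipse, the equivalent step is to put -conf crawler-conf.yaml -local into the "Program arguments" field of the run configuration for CrawlTopology. Without -local, the sketch takes the cluster branch, which corresponds to the StormSubmitter call in the stack trace above.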

sebastian-nagel commented 7 years ago

The wiki contains a description of how to run the topology locally.

jnioche commented 7 years ago

@sebastian-nagel that wiki page needs fixing. Running with mvn-exec does not work, see #324.

The README file generated by the archetype contains the correct instructions, i.e. run it with Storm installed (but it also works from Eclipse, which is a good way of debugging).