codelibs / elasticsearch-river-web

Web Crawler for Elasticsearch
Apache License 2.0

Error message when running river-web #115

Open marcshep-scribe opened 8 years ago

marcshep-scribe commented 8 years ago

I am using Debian 8.1, Java 8

    {
      "name" : "search",
      "cluster_name" : "scribeweb",
      "version" : {
        "number" : "2.2.0",
        "build_hash" : "8ff36d139e16f8720f2947ef62c8167a888992fe",
        "build_timestamp" : "2016-01-27T13:32:39Z",
        "build_snapshot" : false,
        "lucene_version" : "5.4.1"
      },
      "tagline" : "You Know, for Search"
    }

The command I run is

    ./bin/riverweb --config-id my_web --cluster-name scribeweb --cleanup

with the output that follows:

    None of the configured nodes are available: [{#transport#-1}{172.31.26.218}{172.31.26.218:9200}]

I'm fairly new to Elasticsearch and this is the first time I've tried to set up river-web.

marevol commented 8 years ago

> [{#transport#-1}{172.31.26.218}{172.31.26.218:9200}]

The transport port is 9300, not 9200. Check your riverweb.properties.

marcshep-scribe commented 8 years ago

Thank you, I was able to get it working. Another quick question: the site I'm crawling is dynamic and rendered with JavaScript. Does river-web crawl dynamic sites? I'm not seeing the content.

marevol commented 8 years ago

> Does river-web crawl dynamic sites?

It's under development. I added a "web_driver_urls" property as a URL filter pattern, but it's not tested yet...
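
As a rough illustration of what a URL filter pattern could do, the sketch below routes a URL to a browser-driven fetcher only when it matches one of a list of patterns. This is not river-web's implementation. It assumes the patterns are regular expressions that must match the whole URL, which the thread does not confirm. The function name `needs_web_driver` and the example.com patterns are hypothetical.

```python
import re


def needs_web_driver(url, patterns):
    """Return True if url fully matches any of the given regex patterns.

    Assumption: patterns are regular expressions applied to the whole URL.
    """
    return any(re.fullmatch(pattern, url) for pattern in patterns)


# Hypothetical patterns: only pages under /app/ are treated as JS-rendered.
web_driver_patterns = [r"https?://example\.com/app/.*"]

for candidate in ("http://example.com/app/dashboard",
                  "http://example.com/static/about.html"):
    fetcher = "web driver" if needs_web_driver(candidate, web_driver_patterns) else "plain HTTP"
    print(candidate, "->", fetcher)
```

Limiting browser rendering to matching URLs keeps the slower fetch path off pages that do not need JavaScript.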

marcshep-scribe commented 8 years ago

Does this work with sitemaps as a way to crawl a dynamic site?