nasa-jpl-memex / memex-explorer

Viewers for statistics and dashboarding of Domain Search Engine data
BSD 2-Clause "Simplified" License

How to visualize crawl with Kibana? #686

Open rrgirish opened 9 years ago

rrgirish commented 9 years ago

I tried crawling a couple of sites with the Nutch crawler. It shows that it has crawled ~13000 pages. When I click the visualize button, the Kibana dashboard says I have to configure an index pattern.

"logstash-*" doesn't seem to work.

Does this need log.io to work? What sort of values can I give in the index field to see the visualization?

Is there any documentation for this part?

ahmadia commented 9 years ago

Hi @rrgirish!

Yep, this is an issue on our side! No, log.io isn't the issue, and this isn't particularly well documented.

We're using a version of Nutch that pipes processing through Tika and then indexes the results into ElasticSearch. The name of the crawl is then used as the name of the ElasticSearch index. You can look at the local indices with:

curl http://localhost:9200/_cat/indices (you can also open that URL in your web browser).
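For anyone who would rather script this than read the curl output by eye, here is a minimal Python sketch of the same lookup. It assumes Elasticsearch is listening on localhost:9200 as in the curl command above, and it adds the `h=index` query parameter so that only the index-name column comes back. The function names `parse_index_names` and `list_indices` are made up for illustration and are not part of memex-explorer.

```python
from urllib.request import urlopen


def parse_index_names(cat_output):
    """Turn the plain-text body of /_cat/indices?h=index into a sorted list of names."""
    # Each non-empty line holds one index name, possibly padded with spaces.
    return sorted(line.strip() for line in cat_output.splitlines() if line.strip())


def list_indices(base_url="http://localhost:9200"):
    """Fetch the index names from a running Elasticsearch node (assumed local)."""
    # h=index asks the cat API to print only the index column.
    with urlopen(base_url + "/_cat/indices?h=index") as resp:
        return parse_index_names(resp.read().decode("utf-8"))
```

Calling `list_indices()` against a running node should return a list that includes the name of your crawl, since the crawl name is used as the index name. That name, rather than "logstash-*", is presumably what belongs in Kibana's index pattern field.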

It might not be too much work to do the following:

rrgirish commented 9 years ago

Thanks! Yup, having the visualize link point to the correct index would indeed be less confusing.

brittainhard commented 9 years ago

@ahmadia How do we get around the problem of having to configure the index in Kibana first? Is there a way to point Kibana directly at an index in Elasticsearch without going through this step?

We need to figure this out.