leeadkins / elasticsearch-redis-river

A Redis River for Elastic Search.
MIT License

"error":"NoClassSettingsException[Failed to load class with value [redis]] #3

Closed scalp42 closed 12 years ago

scalp42 commented 12 years ago

Hi,

I installed the plugin using:

root@ubuntu3:/usr/local/elasticsearch/bin# ./plugin -install leeadkins/elasticsearch-redis-river/0.0.4

Output:

-> Installing leeadkins/elasticsearch-redis-river/0.0.4...
Trying https://github.com/downloads/leeadkins/elasticsearch-redis-river/elasticsearch-redis-river-0.0.4.zip...
Downloading ............................................................DONE
Installed redis-river

I then created the river using:

curl -XPUT 'localhost:9200/_river/my_redis_river/_meta' -d '{ "type" : "redis", "redis" : { "host" : "192.168.1.25", "port" : 6379, "key" : "logstash", "mode" : "list", "database" : 0 }, "index" : { "bulk_size" : 100, "bulk_timeout" : 5 } }'

When I try the following query:

curl -XGET '192.168.1.24:9200/_river/my_redis_river/_status'

I get this error:

{"_index":"_river","_type":"my_redis_river","_id":"_status","_version":2,"exists":true, "_source" : {"error":"NoClassSettingsException[Failed to load class with value [redis]]; nested: ClassNotFoundException[redis]; ","node":{"id":"HiS542EuRuSaO327r78CIw","name":"ubuntu2","transport_address":"inet[/192.168.1.26:9300]"}}}

Any idea ?

scalp42 commented 12 years ago

Update: I've added the stack trace from the ES logs:

org.elasticsearch.common.settings.NoClassSettingsException: Failed to load class with value [redis]
    at org.elasticsearch.river.RiverModule.loadTypeModule(RiverModule.java:86)
    at org.elasticsearch.river.RiverModule.spawnModules(RiverModule.java:57)
    at org.elasticsearch.common.inject.ModulesBuilder.add(ModulesBuilder.java:44)
    at org.elasticsearch.river.RiversService.createRiver(RiversService.java:135)
    at org.elasticsearch.river.RiversService$ApplyRivers$2.onResponse(RiversService.java:270)
    at org.elasticsearch.river.RiversService$ApplyRivers$2.onResponse(RiversService.java:264)
    at org.elasticsearch.action.support.TransportAction$ThreadedActionListener$1.run(TransportAction.java:86)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.ClassNotFoundException: redis
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
    at org.elasticsearch.river.RiverModule.loadTypeModule(RiverModule.java:72)
    ... 9 more

leeadkins commented 12 years ago

Interesting. I hadn't seen one like this before. Looking around at other rivers, it seems there are two common things that can cause an issue like this. First, if the datastore isn't running, ES may throw this on boot (which I doubt is the issue here). Second, if you install the plugin while ES is running but try to create the river before restarting the node, it will definitely throw this error.

Was the current ES instance restarted before the call to create the river was made?

scalp42 commented 12 years ago

Hi @leeadkins,

Thank you for the quick answer.

Indeed, I started ES, installed the plugin, and immediately afterwards ran the curl to create the river.

I tried restarting ES afterwards, but it was still throwing the error. I will try deleting the river and recreating it, and I'll let you know.

Greatly appreciated, thanks for your work!

ferhatsb commented 12 years ago

Hi, I also want to add that if you have a cluster where the river plugin is not installed on every node, you will get a ClassNotFoundException whenever the river is assigned to a node that doesn't have the plugin installed. KR, Ferhat

scalp42 commented 12 years ago

So it looks like deleting and recreating the river after the restart fixed the problem!

curl -XGET '192.168.1.26:9200/_river/my_redis_river/_status'

Output:

{"_index":"_river","_type":"my_redis_river","_id":"_status","_version":1,"exists":true, "_source" : {"ok":true,"node":{"id":"dzRReEAAQ8GlL3Gfc5Vqsg","name":"ubuntu3","transport_address":"inet[/192.168.1.24:9300]"}}}

scalp42 commented 12 years ago

I'm gonna jump back in here in case you guys have an idea. The river has been created using this:

curl -XPUT '192.168.1.24:9200/_river/my_redis_river/_meta' -d '{ "type" : "redis", "redis" : { "host" : "192.168.1.25", "port" : 6379, "key" : "data", "mode" : "list", "database" : 0 }, "index" : { "bulk_size" : 10, "bulk_timeout" : 3 } }'

192.168.1.25 is the Redis host and I've checked that everything is correct on the Redis "side":

root@ubuntu3:~# redis-cli llen data
(integer) 379

root@ubuntu3:~# redis-cli
redis 127.0.0.1:6379> keys *
1) "data"

I can also correctly see the _river metadata:

{
  state: open
  settings: {
    index.number_of_shards: 1
    index.number_of_replicas: 1
    index.version.created: 190999
  }
  mappings: {
    my_redis_river: {
      properties: {
        redis: {
          dynamic: true
          properties: {
            port: { ignore_malformed: false, type: long }
            host: { type: string }
            key: { type: string }
            database: { ignore_malformed: false, type: long }
            mode: { type: string }
          }
        }
        node: {
          dynamic: true
          properties: {
            id: { type: string }
            name: { type: string }
            transport_address: { type: string }
          }
        }
        index: {
          dynamic: true
          properties: {
            bulk_size: { ignore_malformed: false, type: long }
            bulk_timeout: { ignore_malformed: false, type: long }
          }
        }
        ok: { type: boolean }
        type: { type: string }
      }
    }
  }
}

However, the values in the key "data" are not being transferred to ES. I've spent some time trying to figure it out, but still have no clues.

If anyone has an idea, I would greatly appreciate it!

Thanks again guys.

PS: It's definitely not a network or auth issue. I can also put my data into ES directly and see it indexed, just not through the Redis river.

leeadkins commented 12 years ago

Could you post an example of one of the items in the data redis key?

scalp42 commented 12 years ago

Thanks a lot for the fast answer.

I sent JSON from /var/log/*.

Here is the output of a simple lindex data 500:

redis 127.0.0.1:6379> lindex data 500
"{\"@source\":\"file://mbp/var/log/system.log\",\"@type\":\"linux-syslog\",\"@tags\":[],\"@fields\":{},\"@timestamp\":\"2012-09-09T04:58:55.883000Z\",\"@source_host\":\"mbp\",\"@source_path\":\"/var/log/system.log\",\"@message\":\"Sep 8 19:48:30 mbp EVE[438]: Active Application: Google Chrome\"}"

leeadkins commented 12 years ago

Ah. That's probably it. This river, much like the RabbitMQ river, uses the Elasticsearch Bulk API to get data in. If Elasticsearch's built-in Bulk Parser doesn't recognize the input, it's discarded.

A given ES node may hold many different indexes and many different objects. Additionally, you may have actions that need to both add and remove data from the index. The Bulk API makes it easy to specify on the fly where data should go and how it should be dealt with, allowing your river to be the workhorse of your ES usage.
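As a sketch of what a mixed add/remove bulk payload looks like (the index name, type, and documents below are made up for illustration, not from this thread):

```python
import json

# Each bulk item is an action/metadata line, optionally followed by a
# source document line. Delete actions carry no source document.
actions = [
    ({"index": {"_index": "analytics", "_type": "analytic", "_id": 1}},
     {"id": 1, "age": 25, "name": "Lee Adkins"}),
    ({"delete": {"_index": "analytics", "_type": "analytic", "_id": 2}},
     None),
]

lines = []
for meta, source in actions:
    lines.append(json.dumps(meta))
    if source is not None:
        lines.append(json.dumps(source))

# Every line, including the last, must be newline-terminated for the
# bulk parser to accept it.
bulk_payload = "\n".join(lines) + "\n"
print(bulk_payload)
```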

The Bulk API's docs can be found here: http://www.elasticsearch.org/guide/reference/api/bulk.html

A few good examples of what it looks like in practice can be found in the RabbitMQ river README too: https://github.com/elasticsearch/elasticsearch-river-rabbitmq/blob/master/README.md

Here is a quick example of a call one could make to redis (via the redis-cli) that would result in data being properly indexed.

LPUSH lk:es:list "{\"index\":{\"_index\":\"analytics\",\"_type\":\"analytic\",\"_id\":1}}\n{\"id\":1,\"age\":25,\"name\":\"Lee Adkins\"}\n"

The first part of that data is the actual action and meta information about it. It's a stringified version of this:

{
   index: {
     _index:"analytics",
     _type: "analytic",
     _id: 1
   }
 }

(of course, you'd put your own index name and data type name in there. The _id is optional, but if you already have a unique ID for the data you're indexing, go ahead and put it in there too.)

Notice that there is a new line ("\n") right after it. That's important.

Next comes the raw data. That's just the stringified raw data (like you've already got). Again, notice that there is a newline at the end. This is apparently important to the Bulk parser in Elasticsearch.

If you can get your log data (or whatever future data you have) to be formatted in this way, rivers like this one and the RabbitMQ one should work just fine.
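For example, here is a minimal Python sketch of wrapping a log document into that format before pushing it to Redis (the helper name, index, and type are my own assumptions, not part of the river):

```python
import json

def to_bulk_entry(doc, index="logstash", doc_type="logs", doc_id=None):
    """Wrap a document dict into the newline-delimited bulk format the
    river expects: an action line, then the source line, each followed
    by a newline."""
    meta = {"index": {"_index": index, "_type": doc_type}}
    if doc_id is not None:
        meta["index"]["_id"] = doc_id
    return json.dumps(meta) + "\n" + json.dumps(doc) + "\n"

entry = to_bulk_entry({
    "@source": "file://mbp/var/log/system.log",
    "@type": "linux-syslog",
    "@message": "Sep 8 19:48:30 mbp EVE[438]: Active Application: Google Chrome",
})

# With redis-py (not run here), the entry would then be pushed onto the
# list the river watches, e.g.:
#   redis.StrictRedis(host="192.168.1.25").lpush("data", entry)
print(entry)
```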

scalp42 commented 12 years ago

Lee,

Thanks a lot for the help again. Indeed, the newline did the trick!

Closing the non-issue,

Best,

Anthony