richardwilly98 / elasticsearch-river-mongodb

MongoDB River Plugin for ElasticSearch

Fix NPE when we cannot get the _id field #502

Closed smecsia closed 1 year ago

smecsia commented 9 years ago

The _id field may occasionally be null, which causes the river to fail completely. I believe the river should be resilient to data-related problems like this.

benmccann commented 9 years ago

I thought _id was always required in MongoDB? Can you reproduce this? I'd be very curious what's happening here

smecsia commented 9 years ago

Unfortunately I did not have much time to investigate the source of the problem, but it has happened to me more than once. In general, Mongo allows inserting documents without the _id. I did not do anything special, so I have no idea why it happened.

smecsia commented 9 years ago

I was able to reproduce the problem with river version 2.0.7. MongoDB (both 2.x and 3.x) allows inserting documents with _id equal to null:

        db.somecollection.insert({"_id": null,"name":"John"});

In this case the river fails to fetch the data and enters an unexpected state with the following exception:

2015-03-29 01:15:33,243 [iver_startup:mongolastic][T#1]] ERROR CollectionSlurper              - Exception while looping in cursor
java.lang.NullPointerException
    at org.elasticsearch.river.mongodb.CollectionSlurper.addInsertToStream(CollectionSlurper.java:238)[elasticsearch-river-mongodb-2.0.7.jar:2.0.7]
    at org.elasticsearch.river.mongodb.CollectionSlurper.importCollection(CollectionSlurper.java:144)[elasticsearch-river-mongodb-2.0.7.jar:2.0.7]
    at org.elasticsearch.river.mongodb.CollectionSlurper.importInitial(CollectionSlurper.java:72)[elasticsearch-river-mongodb-2.0.7.jar:2.0.7]
    at org.elasticsearch.river.mongodb.MongoDBRiver$1.run(MongoDBRiver.java:305)[elasticsearch-river-mongodb-2.0.7.jar:2.0.7]
    at java.lang.Thread.run(Thread.java:745)[:1.7.0_76]

This PR resolves the problem.
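For illustration, here is a minimal sketch of the kind of null guard that would avoid this exception. It is not the code from this PR. It assumes the NullPointerException comes from dereferencing a null _id (for example, calling toString() on it) and that skipping such documents is acceptable. The class and method names are hypothetical, and a plain Map stands in for the MongoDB driver's document type so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the river's code base.
final class NullIdGuard {

    // Returns the document's _id as a string, or null when the field is
    // missing or explicitly set to null (instead of throwing an NPE).
    static String safeId(Map<String, Object> doc) {
        Object id = doc.get("_id");
        return id == null ? null : id.toString();
    }

    // Collects the ids of documents that can be imported, skipping any
    // document without a usable _id so that one bad record does not
    // abort the whole import loop.
    static List<String> importableIds(List<Map<String, Object>> docs) {
        List<String> ids = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            String id = safeId(doc);
            if (id == null) {
                // Assumption: skipping is preferable to failing the river.
                continue;
            }
            ids.add(id);
        }
        return ids;
    }
}
```

With a guard like this, a document such as {"_id": null, "name": "John"} would be skipped rather than stopping the import. Whether to skip, log, or index it under a generated id is a design decision the sketch does not settle.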

clslrns commented 9 years ago

I ran into the same issue, but in OplogSlurper:

[2015-04-13 11:03:31,891][ERROR][org.elasticsearch.river.mongodb.OplogSlurper] Exception while looping in cursor
java.lang.NullPointerException
        at org.elasticsearch.river.mongodb.OplogSlurper.addInsertToStream(OplogSlurper.java:528)
        at org.elasticsearch.river.mongodb.OplogSlurper.processOplogEntry(OplogSlurper.java:271)
        at org.elasticsearch.river.mongodb.OplogSlurper.run(OplogSlurper.java:109)
        at java.lang.Thread.run(Thread.java:745)

@smecsia, would be great to apply your fix to the OplogSlurper, too :beer:

smecsia commented 9 years ago

@clslrns done. @benmccann Do you have any plans to merge this PR or fix the described problem in the near future?

benmccann commented 9 years ago

Sorry for the delay @smecsia and thanks for checking in. I will try to review this soon

azee commented 9 years ago

+1