richardwilly98 / elasticsearch-river-mongodb

MongoDB River Plugin for ElasticSearch

River isn't indexing from mongoDB #546

Open akluffy opened 9 years ago

akluffy commented 9 years ago

Hi,

A month ago I got it working on a single AWS EC2 instance running MongoDB 3.0.4, ElasticSearch 1.6.0, and River 2.0.9.

However, when I deployed MongoDB and ElasticSearch on separate AWS EC2 instances, it stopped working! Versions: MongoDB 3.0.4, ElasticSearch 1.6.0, River 2.0.9, Mapper Attachments Type 2.7.0 and 2.6.0 (tried both).

Note: The connection is fine. The command "mongo 52.24.225.108@mydatabase" works without problems.

The river config is below: [screenshot, 2015-07-11 1:02 AM]

akluffy commented 9 years ago

To test and reproduce the bug, I ran another test, so please disregard the config screenshot above.

Case 1: MongoDB and Elasticsearch installed on the same machine, say M1 (AWS EC2 Ubuntu). M1's IP is 52.27.8.35. In this case, it works just fine!

Case 2: Elasticsearch installed on a different machine, say M2, with a river created from M2 (Elasticsearch) to M1 (MongoDB). This case doesn't work.

The river configuration is the same in both cases:

curl -XPUT localhost:9200/_river/test/_meta -d '{
    "type": "mongodb",
    "mongodb": {
        "servers": [
            { "host": "52.27.8.35", "port": 27017 }
        ],
        "db": "test",
        "collection": "random",
        "options": { "secondary_read_preference": true },
        "gridfs": false
    },
    "index": {
        "name": "test",
        "type": "random"
    }
}'

[screenshot, 2015-07-11 11:02 PM]

Let me show two different logs here.

First, this is the log for case 1:

[2015-07-12 05:26:34,448][INFO ][cluster.metadata ] [Man-Beast] [_river] creating index, cause [auto(index api)], templates [], shards [1]/[1], mappings [test]
[2015-07-12 05:26:34,490][INFO ][cluster.metadata ] [Man-Beast] [_river] update_mapping test
[2015-07-12 05:26:34,491][INFO ][river ] [Man-Beast] rivers have been deprecated. Read https://www.elastic.co/blog/deprecating_rivers
[2015-07-12 05:26:34,492][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB River Plugin - version[2.0.9] - hash[73ddea5] - time[2015-04-06T21:16:46Z]
[2015-07-12 05:26:34,492][INFO ][river.mongodb.util ] setRiverStatus called with test - RUNNING
[2015-07-12 05:26:34,493][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] River test startup pending
[2015-07-12 05:26:34,495][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] Starting river test
[2015-07-12 05:26:34,496][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB options: secondaryreadpreference [true], drop_collection [false], include_collection [], throttlesize [5000], gridfs [false], filter [null], db [test], collection [random], script [null], indexing to [test]/[random]
[2015-07-12 05:26:34,512][INFO ][cluster.metadata ] [Man-Beast] [test] creating index, cause [api], templates [], shards [5]/[1], mappings []
[2015-07-12 05:26:34,577][INFO ][org.elasticsearch.river.mongodb.MongoConfigProvider] MongoDB version - 3.0.4
[2015-07-12 05:26:34,600][INFO ][org.elasticsearch.river.mongodb.CollectionSlurper] MongoDBRiver is beginning initial import of test.random
[2015-07-12 05:26:34,601][INFO ][org.elasticsearch.river.mongodb.CollectionSlurper] Number of documents indexed in initial import of test.random: 55
[2015-07-12 05:26:34,643][INFO ][cluster.metadata ] [Man-Beast] [_river] update_mapping test
[2015-07-12 05:26:34,643][INFO ][cluster.metadata ] [Man-Beast] [test] update_mapping random
[2015-07-12 05:26:34,660][INFO ][cluster.metadata ] [Man-Beast] [test] update_mapping random
[2015-07-12 05:26:34,672][INFO ][cluster.metadata ] [Man-Beast] [test] update_mapping random
[2015-07-12 05:26:34,685][INFO ][cluster.metadata ] [Man-Beast] [_river] update_mapping test
[2015-07-12 05:26:35,101][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] Started river test

And this is the log for case 2:

[2015-07-12 05:40:45,126][INFO ][cluster.metadata ] [Drax the Destroyer] [_river] creating index, cause [auto(index api)], templates [], shards [1]/[1], mappings [test]
[2015-07-12 05:40:45,200][INFO ][cluster.metadata ] [Drax the Destroyer] [_river] update_mapping test
[2015-07-12 05:40:45,201][INFO ][river ] [Drax the Destroyer] rivers have been deprecated. Read https://www.elastic.co/blog/deprecating_rivers
[2015-07-12 05:40:45,202][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB River Plugin - version[2.0.9] - hash[73ddea5] - time[2015-04-06T21:16:46Z]
[2015-07-12 05:40:45,202][INFO ][river.mongodb.util ] setRiverStatus called with test - RUNNING
[2015-07-12 05:40:45,208][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] River test startup pending
[2015-07-12 05:40:45,211][INFO ][cluster.metadata ] [Drax the Destroyer] [_river] update_mapping test
[2015-07-12 05:40:45,216][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] Starting river test
[2015-07-12 05:40:45,217][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB options: secondaryreadpreference [true], drop_collection [false], include_collection [], throttlesize [5000], gridfs [false], filter [null], db [test], collection [random], script [null], indexing to [test]/[random]
[2015-07-12 05:40:45,237][INFO ][cluster.metadata ] [Drax the Destroyer] [test] creating index, cause [api], templates [], shards [5]/[1], mappings []
[2015-07-12 05:40:45,354][INFO ][cluster.metadata ] [Drax the Destroyer] [_river] update_mapping test

Note: The connection is fine.

You can try the command: mongo 52.27.8.35/test. What really confuses me is that it works on a single machine but not across two machines. Why? Is it a version problem?
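A quick way to see where a river startup stalls is to check which startup messages made it into the log. The sketch below is only an illustration: the milestone strings are copied from the case 1 log in this comment, and the function names are hypothetical, not part of the river plugin. It reports how far startup got, not why it stopped.

```python
# Startup milestones, in order, taken from the working (case 1) log.
MILESTONES = [
    "Starting river",
    "MongoDB version",
    "beginning initial import",
    "Started river",
]


def reached_milestones(log_text):
    """Return the milestones that occur in log_text, in startup order."""
    return [m for m in MILESTONES if m in log_text]


def first_missing_milestone(log_text):
    """Return the first milestone absent from log_text, or None if all appear."""
    for milestone in MILESTONES:
        if milestone not in log_text:
            return milestone
    return None


if __name__ == "__main__":
    # Last lines of the case 2 log, shortened for the example.
    case2_tail = (
        "[org.elasticsearch.river.mongodb.MongoDBRiver] Starting river test\n"
        "[cluster.metadata ] [Drax the Destroyer] [test] creating index\n"
        "[cluster.metadata ] [Drax the Destroyer] [_river] update_mapping test\n"
    )
    print(first_missing_milestone(case2_tail))
```

Run against the case 2 log, this reports "MongoDB version" as the first missing message, which matches what the two logs show: case 2 never gets as far as reading the MongoDB server version.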

twistedfategit commented 9 years ago

see here https://github.com/richardwilly98/elasticsearch-river-mongodb/issues/548#issuecomment-122620187

hzm1029 commented 9 years ago

I just got it running on CentOS with ES 1.6.0 and MongoDB 3.0.2. Try removing the "options": {...} line. Maybe the secondary node has no oplog.

akluffy commented 9 years ago

@hzm1029 Still doesn't work after deleting "options"

themez commented 9 years ago

I ran into this problem too: elasticsearch 1.4.2, mongodb 3.0, river plugin 2.0.9.

I've tried downgrading the mongodb and river plugin versions, but neither helped.

It seems to stop while connecting to mongodb, even though connecting with mongo host:port/testmongo works. Here's my log:

[2015-08-12 03:40:43,067][INFO ][node                     ] [Doctor Leery] version[1.4.2], pid[1], build[927caff/2014-12-16T14:11:12Z]
[2015-08-12 03:40:43,068][INFO ][node                     ] [Doctor Leery] initializing ...
[2015-08-12 03:40:43,113][INFO ][plugins                  ] [Doctor Leery] loaded [mapper-attachments, mongodb-river], sites [river-mongodb]
[2015-08-12 03:40:45,388][INFO ][node                     ] [Doctor Leery] initialized
[2015-08-12 03:40:45,392][INFO ][node                     ] [Doctor Leery] starting ...
[2015-08-12 03:40:45,498][INFO ][transport                ] [Doctor Leery] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.17.0.48:9300]}
[2015-08-12 03:40:45,524][INFO ][discovery                ] [Doctor Leery] elasticsearch/WU5DqE_HRqGwNI1mpUwtWQ
[2015-08-12 03:40:49,292][INFO ][cluster.service          ] [Doctor Leery] new_master [Doctor Leery][WU5DqE_HRqGwNI1mpUwtWQ][ca838da745f2][inet[/172.17.0.48:9300]], reason: zen-disco-join (elected_as_master)
[2015-08-12 03:40:49,314][INFO ][http                     ] [Doctor Leery] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.48:9200]}
[2015-08-12 03:40:49,314][INFO ][node                     ] [Doctor Leery] started
[2015-08-12 03:40:49,318][INFO ][gateway                  ] [Doctor Leery] recovered [0] indices into cluster_state
[2015-08-12 03:41:25,830][INFO ][cluster.metadata         ] [Doctor Leery] [_river] creating index, cause [auto(index api)], shards [1]/[1], mappings [mongodb]
[2015-08-12 03:41:26,147][INFO ][cluster.metadata         ] [Doctor Leery] [_river] update_mapping [mongodb] (dynamic)
[2015-08-12 03:41:27,171][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB River Plugin - version[2.0.9] - hash[73ddea5] - time[2015-04-06T21:16:46Z]
[2015-08-12 03:41:27,182][INFO ][river.mongodb.util       ] setRiverStatus called with mongodb - RUNNING
[2015-08-12 03:41:27,185][INFO ][cluster.metadata         ] [Doctor Leery] [_river] update_mapping [mongodb] (dynamic)
[2015-08-12 03:41:27,190][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] River mongodb startup pending
[2015-08-12 03:41:27,217][INFO ][cluster.metadata         ] [Doctor Leery] [_river] update_mapping [mongodb] (dynamic)
[2015-08-12 03:41:27,220][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] Starting river mongodb
[2015-08-12 03:41:27,220][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB options: secondaryreadpreference [false], drop_collection [false], include_collection [], throttlesize [5000], gridfs [false], filter [null], db [testmongo], collection [person], script [null], indexing to [mongoindex]/[person]
[2015-08-12 03:41:27,252][INFO ][cluster.metadata         ] [Doctor Leery] [mongoindex] creating index, cause [api], shards [5]/[1], mappings []
[2015-08-12 03:41:27,377][INFO ][river.mongodb            ] [Doctor Leery] Creating MongoClient for [[123.59.43.117:27017]]


themez commented 9 years ago

I figured out that the problem is that the MongoClient is not connecting correctly, because my replica set config looks like this:

{
    "_id" : "rs0",
    "version" : 2,
    "members" : [
        {
            "_id" : 0,
            "host" : "dev:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : 0,
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatTimeoutSecs" : 10,
        "getLastErrorModes" : {

        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        }
    }
}

The hostname dev cannot be resolved from the elasticsearch machine. I reconfigured the replica member host and now it works fine.

@akluffy in your case 2 you installed elasticsearch on a different machine, so maybe you had the same problem as me?
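To check for this on the Elasticsearch machine, one option is to take the replica set config (as printed above) and test whether each member host resolves there. This is a minimal sketch under assumptions: the function names are hypothetical, the config is passed in as a plain dict, and only name resolution is checked, not whether MongoDB is reachable on the port.

```python
import socket


def split_host(entry, default_port=27017):
    """Split a "host:port" member entry; fall back to default_port if no port is given."""
    host, sep, port = entry.rpartition(":")
    if not sep:
        return entry, default_port
    return host, int(port)


def unresolvable_members(rs_config, resolve=socket.getaddrinfo):
    """Return the member "host" entries whose hostname does not resolve on this machine."""
    failed = []
    for member in rs_config.get("members", []):
        host, port = split_host(member["host"])
        try:
            resolve(host, port)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(member["host"])
    return failed


if __name__ == "__main__":
    # Trimmed copy of the replica set config from this comment.
    config = {"_id": "rs0", "members": [{"_id": 0, "host": "dev:27017"}]}
    print(unresolvable_members(config))
```

If a member such as dev:27017 shows up in the result when this runs on the Elasticsearch host, the river's MongoClient is likely to hit the same resolution failure, even though connecting by IP address with the mongo shell works.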

akluffy commented 9 years ago

Yeah, it should be the same problem.
