Open jererc opened 10 years ago
@jererc can you please provide the document attributes values in MongoDB (especially modified and delete before the update and after)? I will write a unit test.
Hello
the mongodb documents fields and values and the operation from the log are:
EDIT: I forgot to mention the values have been correctly updated in mongodb
Best regards, Jerome
@jererc I cannot reproduce the issue. Can you please take a look at the test just committed and let me know if it matches your use case.
The problem is the issue is kind of random. We reproduced it 2 times by updating a bunch of mongo documents ({ "$set" : { "deleted" : true , "modified" : { "$date" :
We have been seeing similar issues to this within our application. We are speculating here but does elasticsearch (and this river) support having multiple updates to the same document within a batch?
to expand. We have oplog.rs records of
> db.oplog.rs.find( { "o2._id" : "e8e1d6b7-14fa-47dd-a397-33ca428cc5f5" })
{
ts: Timestamp(1394590962, 3),
h: NumberLong("-5053893563459561533"),
v: 2,
op: "u",
ns: "waggle.OrganizationProfile",
o2: {
_id: "e8e1d6b7-14fa-47dd-a397-33ca428cc5f5"
},
o: {
"$set": {
Campaigns: [ ],
Candidate: null,
CandidateType: 0,
FirstJoined: ISODate("2014-03-12T02:21:50.640Z"),
JoinCampaign: null,
LastJoined: ISODate("2014-03-12T02:21:50.640Z"),
NoCall: false,
"Profile.PersonalDetail": {
FirstName: null,
LastName: null,
Email: "me12345678@findly.com",
Location: {
Address: null,
City: null,
State: null,
Country: null,
PostCode: null,
Coordinates: [
0,
0
]
},
Phone: null,
HighestEducationLevel: null
},
"Profile.PhotoUrl140": "manual_140x140.jpg",
"Profile.PhotoUrl50": "manual_50x50.jpg",
"Profile.PhotoUrl80": "manual_80x80.jpg",
"Profile.ProfessionalDetail": {
Education: [ ],
WorkHistories: [
{
Employer: null,
JobTitle: null,
Description: null,
StartDate: null,
EndDate: null,
IsCurrent: true
}
],
Interests: [
{
_id: 10,
Title: "Internships"
},
{
_id: 19,
Title: "Jobs for Recent Grads"
}
]
},
"Profile.SocialDetail": {
LinkedIn: null,
GooglePlus: null,
Facebook: null
},
ReferringUrl: null,
Source: "",
Version: 2
}
}
}
{
ts: Timestamp(1394590985, 2),
h: NumberLong("4000612255181295220"),
v: 2,
op: "u",
ns: "waggle.OrganizationProfile",
o2: {
_id: "e8e1d6b7-14fa-47dd-a397-33ca428cc5f5"
},
o: {
"$set": {
Campaigns: [ ],
Candidate: null,
CandidateType: 0,
FirstJoined: ISODate("2014-03-12T02:21:50.640Z"),
JoinCampaign: null,
LastJoined: ISODate("2014-03-12T02:21:50.640Z"),
NoCall: false,
"Profile.PersonalDetail": {
FirstName: "Earl",
LastName: "dasdas",
Email: "me12345678@findly.com",
Location: {
Address: null,
City: "Auckland",
State: "Auckland",
Country: "NZ",
PostCode: null,
Coordinates: [
174.76333150000005,
-36.8484597
]
},
Phone: "88899",
HighestEducationLevel: 1
},
"Profile.PhotoUrl140": manual_140x140.jpg",
"Profile.PhotoUrl50": "manual_50x50.jpg",
"Profile.PhotoUrl80": "manual_80x80.jpg",
"Profile.ProfessionalDetail": {
Education: [ ],
WorkHistories: [
{
Employer: "qwerty",
JobTitle: "Senior Software Developer",
Description: null,
StartDate: null,
EndDate: null,
IsCurrent: true
}
],
Interests: [
{
_id: 9,
Title: "Information Technology"
}
]
},
"Profile.SocialDetail": {
LinkedIn: null,
GooglePlus: null,
Facebook: null
},
ReferringUrl: null,
Source: "",
Version: 3
}
}
}
As you hopefully can see the oplog times of these is close together. We are also forcing reads to be from a secondary which might be contributing to the issue?
After processing these records we are able to see that the document in elasticsearch has Version:2 instead of Version:3. (plus is missing all the other data).
Is your MongoDB installation sharded?
Hello
We use ES v0.90.7, mongo v2.4.9, and river-mongodb v1.7.3.
I just noticed some documents from mongo are not correctly updated. Here is a log showing the operations for 2 different mongo documents which "deleted" and "modified" fields have been updated in mongo. The first one (_id 52f264addede831ca7c6a476) is correctly updated in ES while the second one (_id 52f264addede831ca7c6a477) is not.
The ES mapping contains "deleted" : {"type" : "boolean"} and "modified" : {"type" : "date", "format" : "dateOptionalTime"}
Here are the river settings: { "_index" : "_river", "_type" : "media", "_id" : "_meta", "_version" : 1, "exists" : true, "_source" : {"mongodb": {"db": "cloudgate-dev", "options": {"secondary_read_preference": true, "include_fields": ["deleted", "user", "created", "modified", "meta", "assets", "status"]}, "collection": "media", "servers": [{"host": "10.201.12.47", "port": 27017}, {"host": "10.201.12.48", "port": 27017}, {"host": "10.201.12.49", "port": 27017}]}, "type": "mongodb", "index": {"bulk": {"concurrent_requests": 1}, "type": "media", "name": "media"}} }
The log (sorry if it's too long):
[2014-02-05 17:20:44,770][DEBUG][org.elasticsearch.river.mongodb.Slurper] collection: media [2014-02-05 17:20:44,770][DEBUG][org.elasticsearch.river.mongodb.Slurper] oplog entry - namespace [cloudgate-dev.media], operation [UPDATE] [2014-02-05 17:20:44,770][DEBUG][org.elasticsearch.river.mongodb.Slurper] oplog processing item { "ts" : { "$ts" : 1391617244 , "$inc" : 1} , "h" : -9194859396347773319 , "v" : 2 , "op" : "u" , "ns" : "cloudgate-dev.media" , "o2" : { "_id" : { "$oid" : "52f264addede831ca7c6a476"}} , "o" : { "$set" : { "deleted" : true , "modified" : { "$date" : "2014-02-05T16:20:44.766Z"}}}} [2014-02-05 17:20:44,771][DEBUG][org.elasticsearch.river.mongodb.Slurper] Updated item: { "_id" : { "$oid" : "52f264addede831ca7c6a476"}} [2014-02-05 17:20:44,771][DEBUG][org.elasticsearch.river.mongodb.Slurper] addQueryToStream - operation [UPDATE], currentTimestamp [TS time:Wed Feb 05 17:20:44 CET 2014 inc:1], update [{ "_id" : { "$oid" : "52f264addede831ca7c6a476"}}] [2014-02-05 17:20:44,772][DEBUG][org.elasticsearch.river.mongodb.Slurper] addToStream - operation [UPDATE], currentTimestamp [TS time:Wed Feb 05 17:20:44 CET 2014 inc:1], data [{ "_id" : { "$oid" : "52f264addede831ca7c6a476"} , "assets" : [ { "audio_bitrate" : 63777 , "audio_codec" : "MPEG-2 Audio layer 3" , "audio_nbr_channel" : 1 , "audio_samplerate" : 22050 , "checksum" : "a3f8216bcd983fa08367e9f45f29e24b" , "container" : "Flash Video" , "created" : { "$date" : "2014-02-05T16:20:08.973Z"} , "done" : true , "duration" : 17 , "file_extension" : "flv" , "file_size" : 830966 , "global_bitrate" : 368978 , "name" : "source" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad79c7a520f12b918c" , "type" : "vod" , "video_aspect" : 1.3333333333333333 , "video_bitrate" : 297209 , "video_codec" : "Sorenson H263" , "video_fps" : 29.97 , "video_fps_mode" : "CFR" , "video_height" : 240 , "video_interlaced" : false , "video_rotation" : 0.0 , "video_width" : 320} , { "audio_bitrate" : 37981 , "audio_codec" : "AAC LC" , "container" : "MPEG-4" , "created" : { "$date" : "2014-02-05T16:20:21.159Z"} , "done" : true , "duration" : 18 , "file_size" : 1610383 , "global_bitrate" : 712913 , "name" : "mp4_h264_aac_hq" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad5d5a182e493c4683" , "type" : "vod" , "video_bitrate" : 700000 , "video_codec" : "AVC" , "video_height" : 240 , "video_width" : 320} , { "container" : "JPEG" , "created" : { "$date" : "2014-02-05T16:20:13.569Z"} , "done" : true , "file_size" : 36353 , "name" : "jpeg_thumbnail_source" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad79c7a5210422cd68" , "type" : "thumbnail" , "video_codec" : "MJPEG (Motion JPEG)" , "video_height" : 480 , "video_width" : 640} , { "container" : "JPEG" , "created" : { "$date" : "2014-02-05T16:20:11.147Z"} , "done" : true , "file_size" : 15040 , "name" : "jpeg_thumbnail_large" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad5d5a182e344a92c4" , "type" : "thumbnail" , "video_codec" : "MJPEG (Motion JPEG)" , "video_height" : 240 , "video_width" : 320} , { "container" : "JPEG" , "created" : { "$date" : "2014-02-05T16:20:11.104Z"} , "done" : true , "file_size" : 6806 , "name" : "jpeg_thumbnail_medium" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad5d5a182e1e874567" , "type" : "thumbnail" , "video_codec" : "MJPEG (Motion JPEG)" , "video_height" : 120 , "video_width" : 160} , { "container" : "JPEG" , "created" : { "$date" : "2014-02-05T16:20:10.311Z"} , "done" : true , "file_size" : 2753 , "name" : "jpeg_thumbnail_small" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad5d5a182dd1807d51" , "type" : "thumbnail" , "video_codec" : "MJPEG (Motion JPEG)" , "video_height" : 60 , "video_width" : 80} , { "audio_bitrate" : 37981 , "audio_codec" : "AAC LC" , "container" : "MPEG-4" , "created" : { "$date" : "2014-02-05T16:20:16.250Z"} , "done" : true , "duration" : 18 , "file_size" : 921018 , "global_bitrate" : 407733 , "name" : "mp4_h264_aac" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad5d5a182e493c4684" , "type" : "vod" , "video_bitrate" : 360000 , "video_codec" : "AVC" , "video_height" : 240 , "video_width" : 320} , { "audio_bitrate" : 31914 , "audio_codec" : "AAC LC" , "container" : "MPEG-4" , "created" : { "$date" : "2014-02-05T16:20:19.804Z"} , "done" : true , "duration" : 18 , "file_size" : 502725 , "global_bitrate" : 222555 , "name" : "mp4_h264_aac_ld" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad79c7a5210422cd69" , "type" : "vod" , "video_bitrate" : 180000 , "video_codec" : "AVC" , "video_height" : 240 , "video_width" : 320}] , "created" : { "$date" : "2014-02-05T16:19:57.469Z"} , "deleted" : true , "meta" : { "title" : "Chat-justin-bieber 3.flv"} , "modified" : { "$date" : "2014-02-05T16:20:44.766Z"} , "status" : "ready" , "user" : { "$oid" : "50aa108ddede83201c00000a"}}], collection [media] [2014-02-05 17:20:44,772][DEBUG][org.elasticsearch.river.mongodb.Indexer] updateBulkRequest for id: [52f264addede831ca7c6a476], operation: [UPDATE] [2014-02-05 17:20:44,773][DEBUG][org.elasticsearch.river.mongodb.Indexer] Operation: UPDATE - index: media - type: media - routing: null - parent: null [2014-02-05 17:20:44,773][DEBUG][org.elasticsearch.river.mongodb.Indexer] Update operation - id: 52f264addede831ca7c6a476 - contains attachment: false [2014-02-05 17:20:44,773][DEBUG][org.elasticsearch.river.mongodb.Slurper] collection: media [2014-02-05 17:20:44,773][DEBUG][org.elasticsearch.river.mongodb.Slurper] oplog entry - namespace [cloudgate-dev.media], operation [UPDATE] [2014-02-05 17:20:44,773][DEBUG][org.elasticsearch.river.mongodb.Slurper] oplog processing item { "ts" : { "$ts" : 1391617244 , "$inc" : 2} , "h" : 8308485158567442109 , "v" : 2 , "op" : "u" , "ns" : "cloudgate-dev.media" , "o2" : { "_id" : { "$oid" : "52f264addede831ca7c6a477"}} , "o" : { "$set" : { "deleted" : true , "modified" : { "$date" : "2014-02-05T16:20:44.766Z"}}}} [2014-02-05 17:20:44,773][DEBUG][org.elasticsearch.river.mongodb.Slurper] Updated item: { "_id" : { "$oid" : "52f264addede831ca7c6a477"}} [2014-02-05 17:20:44,773][DEBUG][org.elasticsearch.river.mongodb.Slurper] addQueryToStream - operation [UPDATE], currentTimestamp [TS time:Wed Feb 05 17:20:44 CET 2014 inc:2], update [{ "_id" : { "$oid" : "52f264addede831ca7c6a477"}}] [2014-02-05 17:20:44,775][DEBUG][org.elasticsearch.river.mongodb.Slurper] addToStream - operation [UPDATE], currentTimestamp [TS time:Wed Feb 05 17:20:44 CET 2014 inc:2], data [{ "_id" : { "$oid" : "52f264addede831ca7c6a477"} , "assets" : [ { "audio_bitrate" : 63777 , "audio_codec" : "MPEG-2 Audio layer 3" , "audio_nbr_channel" : 1 , "audio_samplerate" : 22050 , "checksum" : "a3f8216bcd983fa08367e9f45f29e24b" , "container" : "Flash Video" , "created" : { "$date" : "2014-02-05T16:20:08.960Z"} , "done" : true , "duration" : 17 , "file_extension" : "flv" , "file_size" : 830966 , "global_bitrate" : 368978 , "name" : "source" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad79c7a520c3fd0a7f" , "type" : "vod" , "video_aspect" : 1.3333333333333333 , "video_bitrate" : 297209 , "video_codec" : "Sorenson H263" , "video_fps" : 29.97 , "video_fps_mode" : "CFR" , "video_height" : 240 , "video_interlaced" : false , "video_rotation" : 0.0 , "video_width" : 320} , { "audio_bitrate" : 37981 , "audio_codec" : "AAC LC" , "container" : "MPEG-4" , "created" : { "$date" : "2014-02-05T16:20:16.433Z"} , "done" : true , "duration" : 18 , "file_size" : 1610383 , "global_bitrate" : 712913 , "name" : "mp4_h264_aac_hq" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad5d5a182dd1807d52" , "type" : "vod" , "video_bitrate" : 700000 , "video_codec" : "AVC" , "video_height" : 240 , "video_width" : 320} , { "container" : "JPEG" , "created" : { "$date" : "2014-02-05T16:20:13.605Z"} , "done" : true , "file_size" : 36353 , "name" : "jpeg_thumbnail_source" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad79c7a521123faf03" , "type" : "thumbnail" , "video_codec" : "MJPEG (Motion JPEG)" , "video_height" : 480 , "video_width" : 640} , { "container" : "JPEG" , "created" : { "$date" : "2014-02-05T16:20:12.782Z"} , "done" : true , "file_size" : 15040 , "name" : "jpeg_thumbnail_large" , "ready" : true , "status" : "ready" , "task_id" : "52f264ad79c7a520f12b918e" , "type" : "thumbnail" , "video_codec" : "MJPEG (Motion JPEG)" , "video_height" : 240 , "video_width" : 320} , { "container" : "JPEG" , "created" : { "$date" : "2014-02-05T16:20:14.368Z"} , "done" : true , "file_size" : 6806 , "name" : "jpeg_thumbnail_medium" , "ready" : true , "status" : "ready" , "task_id" : "52f264ae5d5a182e1e874568" , "type" : "thumbnail" , "video_codec" : "MJPEG (Motion JPEG)" , "video_height" : 120 , "video_width" : 160} , { "container" : "JPEG" , "created" : { "$date" : "2014-02-05T16:20:14.496Z"} , "done" : true , "file_size" : 2753 , "name" : "jpeg_thumbnail_small" , "ready" : true , "status" : "ready" , "task_id" : "52f264ae79c7a520c3fd0a80" , "type" : "thumbnail" , "video_codec" : "MJPEG (Motion JPEG)" , "video_height" : 60 , "video_width" : 80} , { "audio_bitrate" : 37981 , "audio_codec" : "AAC LC" , "container" : "MPEG-4" , "created" : { "$date" : "2014-02-05T16:20:15.874Z"} , "done" : true , "duration" : 18 , "file_size" : 921018 , "global_bitrate" : 407733 , "name" : "mp4_h264_aac" , "ready" : true , "status" : "ready" , "task_id" : "52f264ae5d5a182e493c4686" , "type" : "vod" , "video_bitrate" : 360000 , "video_codec" : "AVC" , "video_height" : 240 , "video_width" : 320} , { "audio_bitrate" : 31914 , "audio_codec" : "AAC LC" , "container" : "MPEG-4" , "created" : { "$date" : "2014-02-05T16:20:22.606Z"} , "done" : true , "duration" : 18 , "file_size" : 502725 , "global_bitrate" : 222555 , "name" : "mp4_h264_aac_ld" , "ready" : true , "status" : "ready" , "task_id" : "52f264ae5d5a182e1e874569" , "type" : "vod" , "video_bitrate" : 180000 , "video_codec" : "AVC" , "video_height" : 240 , "video_width" : 320}] , "created" : { "$date" : "2014-02-05T16:19:57.873Z"} , "deleted" : false , "meta" : { "title" : "Chat-justin-bieber 4.flv"} , "modified" : { "$date" : "2014-02-05T16:20:22.607Z"} , "status" : "ready" , "user" : { "$oid" : "50aa108ddede83201c00000a"}}], collection [media] [2014-02-05 17:20:44,775][DEBUG][org.elasticsearch.river.mongodb.Slurper] collection: media [2014-02-05 17:20:44,775][DEBUG][org.elasticsearch.river.mongodb.Slurper] oplog entry - namespace [cloudgate-dev.media], operation [UPDATE] [2014-02-05 17:20:44,775][DEBUG][org.elasticsearch.river.mongodb.Indexer] updateBulkRequest for id: [52f264addede831ca7c6a477], operation: [UPDATE] [2014-02-05 17:20:44,775][DEBUG][org.elasticsearch.river.mongodb.Indexer] Operation: UPDATE - index: media - type: media - routing: null - parent: null [2014-02-05 17:20:44,775][DEBUG][org.elasticsearch.river.mongodb.Indexer] Update operation - id: 52f264addede831ca7c6a477 - contains attachment: false
Best regards, Jerome