Closed: timcmiller closed this issue 6 years ago
Hi @timcmiller
Would you mind putting together a minimal test case (preferably a real test runnable with "go test") to help us take a look at this?
Thanks! Dom
Hi @domodwyer, I'm one of @timcmiller's colleagues. I've put together a gist here whose only external dependency is mgo. If you re-run go test ./... with a local mongo instance running, you'll see the behavior Tim described above. Based on what I see in the database after a failed test case, I've concluded that it creates a duplicate because the keys of objects are being switched. As expected, unless you explicitly index on the keys in an object, Mongo does not consider {objects: {foo: "bar", baz: "bah"}} to be the same as {objects: {baz: "bah", foo: "bar"}}.
As far as I can tell, this order switching happens because iterating over a map is randomized in Go. I'm pretty sure we're passing in the objects in the same order, so it seems the change in order happens somewhere in the upsert function (?). If that's the case, I would say this behaves differently than an update with upsert: true in the mongo shell, since passing the same object selector there in the same order guarantees that's the order of the selector. I may be way off base here, so we'd love your input on this issue.
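To make the map-ordering point concrete, here is a minimal sketch using only the standard library. The helper name sortedKeys is made up for illustration and is not part of mgo or the gist. It shows that ranging over a Go map gives no order guarantee, while ranging over a sorted key slice does.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical helper: it returns the keys of m in lexical
// order, giving a stable order regardless of Go's randomized map traversal.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	objects := map[string]string{"foo": "bar", "baz": "bah"}

	// Ranging over the map directly may visit "foo" or "baz" first;
	// the order is unspecified and can differ from one loop to the next.
	for k, v := range objects {
		fmt.Println("unordered:", k, v)
	}

	// Ranging over the sorted keys always visits "baz" before "foo".
	for _, k := range sortedKeys(objects) {
		fmt.Println("ordered:", k, objects[k])
	}
}
```

Any serializer that walks the map directly, as in the first loop, can emit the keys in either order, which would be consistent with the intermittent duplicates described above.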
Thanks!
Hi @ckeyes88
You're totally right - it is indeed likely the random iteration order of maps that is biting you here. Unfortunately this is something we can't really work around in mgo (it's part of the Go language specification) and I bet it catches loads of people out. It is particularly important for compound indexes: with a map, the field order will sometimes happen to match the index field ordering and use the index, and other times it won't, resulting in slow and inefficient queries 50% of the time for a compound index covering 2 fields, 66% for 3, etc.
I'd suggest swapping your bson.M (the map type) for a bson.D (an ordered slice of elements), which provides deterministic iteration ordering:
objectsS := bson.D{
	{Name: "objects.foo", Value: "bar"},
	{Name: "objects.baz", Value: "bang"},
}
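If the selector starts life as a map[string]string, one way to get there is to sort the keys and build the ordered document from them. The sketch below is only an illustration: DocElem and D are local stand-ins that mirror the shape of mgo's bson.DocElem and bson.D so it compiles without mgo, and orderedSelector is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sort"
)

// DocElem and D are local stand-ins mirroring the shape of mgo's
// bson.DocElem and bson.D; with mgo you would use those types directly.
type DocElem struct {
	Name  string
	Value interface{}
}

type D []DocElem

// orderedSelector is a hypothetical helper: it turns a map into an ordered
// selector whose keys are sorted lexically and prefixed with "<field>.".
func orderedSelector(field string, m map[string]string) D {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	sel := make(D, 0, len(keys))
	for _, k := range keys {
		sel = append(sel, DocElem{Name: field + "." + k, Value: m[k]})
	}
	return sel
}

func main() {
	sel := orderedSelector("objects", map[string]string{"foo": "bar", "baz": "bang"})
	for _, e := range sel {
		fmt.Printf("%s = %v\n", e.Name, e.Value)
	}
}
```

Sorting is just one way to pick a fixed order. Any rule works as long as every caller builds the selector the same way.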
This should solve the issue, though it isn't the easiest thing to work with. It might be best for your use case to replace your Objects: map[string]string with a custom type that implements GetBSON() and SetBSON() and has deterministic ordering - the backing data structure is obviously dependent on your access patterns and tradeoff choices, but maybe a hash array mapped trie (HAMT) or something similar will work for you?
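As a rough sketch of what such a custom type could look like, the example below keeps a slice of keys next to a map so that lookups stay cheap and serialization order is fixed by insertion order. Everything here is an assumption made for illustration: OrderedStrings is a made-up type, DocElem and D are local stand-ins for mgo's bson.DocElem and bson.D, and only the serializing side is shown. GetBSON uses the same signature as the method on mgo's bson.Getter interface; a SetBSON counterpart for decoding is left out.

```go
package main

import "fmt"

// DocElem and D are local stand-ins mirroring mgo's bson.DocElem and bson.D.
type DocElem struct {
	Name  string
	Value interface{}
}

type D []DocElem

// OrderedStrings is a hypothetical string-to-string map that remembers the
// order in which keys were first inserted.
type OrderedStrings struct {
	keys   []string
	values map[string]string
}

func NewOrderedStrings() *OrderedStrings {
	return &OrderedStrings{values: make(map[string]string)}
}

// Set adds or updates a key; a new key is appended to the end of the order,
// while updating an existing key keeps its original position.
func (o *OrderedStrings) Set(key, value string) {
	if _, ok := o.values[key]; !ok {
		o.keys = append(o.keys, key)
	}
	o.values[key] = value
}

// Get looks a key up through the backing map.
func (o *OrderedStrings) Get(key string) (string, bool) {
	v, ok := o.values[key]
	return v, ok
}

// GetBSON returns the elements in insertion order. With mgo you would build
// and return a bson.D here instead of the local stand-in D.
func (o *OrderedStrings) GetBSON() (interface{}, error) {
	doc := make(D, 0, len(o.keys))
	for _, k := range o.keys {
		doc = append(doc, DocElem{Name: k, Value: o.values[k]})
	}
	return doc, nil
}

func main() {
	o := NewOrderedStrings()
	o.Set("foo", "bar")
	o.Set("baz", "bah")
	doc, _ := o.GetBSON()
	fmt.Println(doc)
}
```

The slice-plus-map layout is only one possible tradeoff. It makes deletes linear in the number of keys, so a different backing structure may suit other access patterns better.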
Best of luck - I'm interested in what your outcome is, please do feel free to open a PR if you think it'll be useful for others!
Dom
@domodwyer Thanks for your help! I hadn't thought about using bson.D, as I'm still getting familiar with all the mgo features. We'll definitely research a solution that fits our use case and follow up with what we find.
Cheers!
I've been working on an upsert query that uses a selector which is a map[string]string. When the selector has multiple keys, about 20% of the time I run my query it ends up creating a new document instead of updating the one that is already in the collection. Here is the code that handles the requests and prepares the upsert.
And here is my test case I'm running against this code.
Here is the stack trace I get on the failing test.
Again, this only seems to fail about 20% of the time. Not sure what is causing it to sometimes fail.