globalsign / mgo

The MongoDB driver for Go
Other
1.97k stars 230 forks source link

Add switch for `omitempty` tag #131

Closed kassiansun closed 6 years ago

kassiansun commented 6 years ago

Currently we have bson.SetJSONTagFallback to fallback to json tag when no bson tag available, which is pretty useful when using protobuf with mgo, leave generated *.pb.go untouched and get a consistent field name between json & bson.

With the default omitempty tag generated by protobuf, updated struct fields with empty values will not be encoded, leave the database untouched. Adding a switch for ignoring omitempty will gain another benefit that we can keep protobuf performant and make updating operations easier.

domodwyer commented 6 years ago

Hi @kassiansun!

I'm not familiar with your use case (storing protobuf generated structures) - do you have a code example that shows the issue? Sounds like it might be a good idea though.

Dom

kassiansun commented 6 years ago

protobuf generate structs with omitempty set by default, this is hard-coded into protobuf's golang implementation.

When used with bson.SetJSONTagFallback(true), if we pass the struct directly to mgo.Insert, it will insert a empty record with only _id field. Even worse with mgo.Update, which we will not be able to update an existed field to its "empty value", as the empty fields will not be encoded at all.

By setting another switch for ignoring omitempty, we can handle these scenarios more gracefully, do something like:

  mgo.Insert(&pb.Msg{})
  // or
  mgo.Update(nil, &bson.M{
    "$set": &pb.Msg{}
  }) // set "message" fields to "", instead of doing nothing at all

The implementation is simple, but I have no idea of writing a test case for mgo, maybe someone more experienced can supply a better patch than above.

kassiansun commented 6 years ago

@domodwyer The patch above is wrong, here is the correct version: https://github.com/kassiansun/turbo/commit/d9db8306069454f2354fa2882105525ab7f4640b

domodwyer commented 6 years ago

Hi @kassiansun

I'm a little worried it would be counterintuitive and go against the expected behaviour for most users.

https://github.com/grpc/grpc-go/issues/481 seems to be about roughly the same thing, @proteneer seemed to have some success by wrapping the PB types. Perhaps wrapping your pb.Msg in a custom type for DB operations would help.

https://github.com/golang/protobuf/issues/238 seems to suggest that using the standard go JSON encoder isn't supported for protobuf use, so perhaps using the fallback option isn't a good idea regardless?

Dom

kassiansun commented 6 years ago

The default behavior will be consistent when the newly-added switch turned off (default).

The issues you referenced are about serializing the generated *.pb.go into json, but we're serializing them into bson. The new switch will provide more control over the bson behaviors, when we fallback into json tags by setting bson.SetJSONTagFallback(true).

domodwyer commented 6 years ago

Hi @kassiansun

We had a discussion as a team and decided this would be rather confusing for end users - we suggest pursuing the answers in the tickets above.

We really appreciate the PR, I hope you will open more in the future.

Dom

szank commented 6 years ago

Hi @kassiansun. I have just stumbled on this thread on golang-nuts and thought it might be somewhat helpful. https://groups.google.com/forum/#!topic/golang-nuts/VHFjFix7hTE

You would have to redefine all the structs where you want to modify the tags, but it should avoid any memory copying.

kassiansun commented 6 years ago

@szank Cool, I'll take a try on it.

kassiansun commented 6 years ago

@szank Actually this won't work across packages, but still good to know this trick.

ddspog commented 6 years ago

@domodwyer Could bson package have some option to set OmitEmpty for every tag? It could be similar to bson.SetJSONTagFallback(true).

I'm working on a wrapper for mgo to short code as much as possible when using this package. The last barrier I encountered was with these omitempty tags. For the way I like to use this mgo package, all fiels on my Document structs should have this tag. And it's really repetitive code, that could be resumed in one line on my configuration.

I'm asking this here, since I don't know if you guys would consider duplicate of this issue.

ddspog commented 6 years ago

It would solve my problem if Marshall or Unmarshall (I'm not sure how it works) could have optional argument to set this default UseOmitEmptyOnEveryField to true. Like...

func Marshal(in interface{}, opts ...MarshallOptions) (out []byte, err error) {
    return MarshalBuffer(in, make([]byte, 0, initialBufferSize, opts...))
}

Then I would call

if buf, err = bson.Marshal(in, bson.MarshallOptions{DefaultOmitEmpty: true}); err == nil {
    // do stuff with buf, like bson.Unmarshall to a bson.M object
}

I think this option can be more controllable for users of this package.

domodwyer commented 6 years ago

@ddspog

The idea of passing an options struct is cool - there's lots of different users of the BSON marshaller and lots of conflicting opinions on how it should work! It would have to be on a different function to maintain backwards compatibility (can't change the definition of Marshal() unfortunately).

That said I'm still not sure if pushing more options and logic into the BSON package so users can avoid tagging their structs is really worth it, it is already getting a bit complicated in terms of "switch this feature on, that off" etc - is there a reason you don't want to add bson:",omitempty" on the fields?

Dom

kassiansun commented 6 years ago

Hi @domodwyer , bson has a different usage purpose from encoding/json, it's more like a internal package. encoding/json can provide a Marshaller for customization, but bson can't, adding more package-level options is a must-have for bson.

But we still can refactor the current design into a Marshaller style, and provide package-level options config for a "default marshaller", like log and other packages, though bson.Marshaller may be not very useful for end users.

domodwyer commented 6 years ago

@kassiansun the BSON package is not internal, it's exported and used by others outside of mgo - function definitions can't change, sorry.

ddspog commented 6 years ago

Well, when trying to use collection.Find(document), if document is a struct or a marshaled bson.M version, it won't work nicely (in my case), unless all empty fields are omitempty.

If I use User as a document, for example:

type User struct {
    IDV        bson.ObjectId `bson:"_id,omitempty"`
    CreatedOnV int64          `bson:"created_on,omitempty"`
    UpdatedOnV int64          `bson:"updated_on,omitempty"`
    UsernameV  string         `form:"username" binding:"required" bson:"username,omitempty"`
    EmailV     string         `form:"email" bson:"email,omitempty"`
    PasswordV  string         `form:"password" binding:"required" bson:"password,omitempty"`
    RoleV      byte           `bson:"role,omitempty"`
}

When searching using ID (to access a page author info), Username (on login) or Role (who have admin priority?), omitempty is essencial to Marshal process. If any isn't omitempty, like Email, it would make collection.Find(User{UsernameV: "myusername"} return 'not found' error.

So, if I could set the Marshal method to work with OmitEmpt set by default, using bson.SetDefaultOmitEmpty(true) for example, I would be able to work with:

type User struct {
    IDV        bson.ObjectId `bson:"_id"`
    CreatedOnV int64          `bson:"created_on"`
    UpdatedOnV int64          `bson:"updated_on"`
    UsernameV  string         `form:"username" binding:"required" bson:"username"`
    EmailV     string         `form:"email" bson:"email"`
    PasswordV  string         `form:"password" binding:"required" bson:"password"`
    RoleV      byte           `bson:"role"`
}

Which is a more readable structure, easier to maintain. If another developer added a needed field, it wouldn't have to remember to add omitempty on my code, and such.

I do understand why bson.Marshal can't change its function definitions, but it would be bad to create a bson.SetDefaultOmitEmpty function (or other better name)?

ddspog commented 6 years ago

@domodwyer I've tried my own solution. Made a test using my fork, altered bson/compatibility.go to add this.

// Current state of the Default OmitEmpty option.
var useDefaultOmitEmpty = false

// SetDefaultOmitEmpty enables or disables the use of OmitEmpty on every tag, with or without this flag. It will
// allows BSON Marshall to eliminate the empty tags from final result.
func SetDefaultOmitEmpty(state bool) {
    useDefaultOmitEmpty = state
}

// DefaultOmitEmptyState returns the current status of Default OmitEmpty option. See SetDefaultOmitEmpty for more
// information.
func DefaultOmitEmptyState() bool {
    return useDefaultOmitEmpty
}

And then on bson/encode.go, at func (e *encoder) addStruct(reflect.Value) body, I've change the condition related to OmitEmpty to this:

if (info.OmitEmpty || DefaultOmitEmptyState()) && isZero(value) {
    continue
}

The result is that if I set it to True before translating a object to an M like this:

bson.SetDefaultOmitEmpty(true)

if buf, err = bson.Marshal(in); err == nil {
    if err = bson.Unmarshal(buf, &target); err == nil {
        out = target.(bson.M)
    }
}

Result will come as expected. The problem for me did come when working with collection.RemoveAll(). It was accusing an QueryError of missing limit field. As I investigated, it was my DefaultOmitEmpty that was causing the problem because RemoveAll() set an deleteOp{Limit: 0} which cause "bson:limit" field, to understand that as a empty field, so it omitted, causing the error.

Of course, in my case, the obvious solution to set bson.SetDefaultOmitEmpty(false) after making my Marshal and Unmarshal operations worked on every test that I make. But I would like an opinion on you guys on that matter, before asking for an PR with this change.

kassiansun commented 6 years ago

Hi, @ddspog this is a different problem, we can't change any bson behavior as this will affect the internal protocol of MongoDB. So if you want to patch this, patch the json tag behavior, and use bson.SetJSONTagFallback.

ddspog commented 6 years ago

@kassiansun How could I patch the json tag behaviour? Can I add a mgo.SetJSONDefaultOmitEmpty(true)? Because I can't simply import the internal package.

kassiansun commented 6 years ago

@ddspog You need to fork and patch it. You can detect whether the tag is from bson or json tag at here: https://github.com/globalsign/mgo/blob/master/bson/bson.go#L703 There is my previous patch to omit omitempty, you can simply change it to force omitempty by setting info.OmitEmpty at here: https://github.com/kassiansun/turbo/commit/d9db8306069454f2354fa2882105525ab7f4640b#diff-87630d60b5b20b5644f728b59ec6bd04R702

ddspog commented 6 years ago

I've resolved to copy my version of bson.Marshal and bson.UNmarshal method to an internal package.