Closed kassiansun closed 6 years ago
Hi @kassiansun!
I'm not familiar with your use case (storing protobuf generated structures) - do you have a code example that shows the issue? Sounds like it might be a good idea though.
Dom
protobuf
generate structs with omitempty
set by default, this is hard-coded into protobuf's golang implementation.
When used with bson.SetJSONTagFallback(true)
, if we pass the struct directly to mgo.Insert
, it will insert a empty record with only _id
field. Even worse with mgo.Update
, which we will not be able to update an existed field to its "empty value", as the empty fields will not be encoded at all.
By setting another switch for ignoring omitempty
, we can handle these scenarios more gracefully, do something like:
mgo.Insert(&pb.Msg{})
// or
mgo.Update(nil, &bson.M{
"$set": &pb.Msg{}
}) // set "message" fields to "", instead of doing nothing at all
The implementation is simple, but I have no idea of writing a test case for mgo
, maybe someone more experienced can supply a better patch than above.
@domodwyer The patch above is wrong, here is the correct version: https://github.com/kassiansun/turbo/commit/d9db8306069454f2354fa2882105525ab7f4640b
Hi @kassiansun
I'm a little worried it would be counterintuitive and go against the expected behaviour for most users.
https://github.com/grpc/grpc-go/issues/481 seems to be about roughly the same thing, @proteneer seemed to have some success by wrapping the PB types. Perhaps wrapping your pb.Msg
in a custom type for DB operations would help.
https://github.com/golang/protobuf/issues/238 seems to suggest that using the standard go JSON encoder isn't supported for protobuf use, so perhaps using the fallback option isn't a good idea regardless?
Dom
The default behavior will be consistent when the newly-added switch turned off (default).
The issues you referenced are about serializing the generated *.pb.go
into json, but we're serializing them into bson
. The new switch will provide more control over the bson behaviors, when we fallback into json tags by setting bson.SetJSONTagFallback(true)
.
Hi @kassiansun
We had a discussion as a team and decided this would be rather confusing for end users - we suggest pursuing the answers in the tickets above.
We really appreciate the PR, I hope you will open more in the future.
Dom
Hi @kassiansun. I have just stumbled on this thread on golang-nuts and thought it might be somewhat helpful. https://groups.google.com/forum/#!topic/golang-nuts/VHFjFix7hTE
You would have to redefine all the structs where you want to modify the tags, but it should avoid any memory copying.
@szank Cool, I'll take a try on it.
@szank Actually this won't work across packages, but still good to know this trick.
@domodwyer Could bson package have some option to set OmitEmpty for every tag? It could be similar to bson.SetJSONTagFallback(true).
I'm working on a wrapper for mgo to short code as much as possible when using this package. The last barrier I encountered was with these omitempty tags. For the way I like to use this mgo package, all fiels on my Document structs should have this tag. And it's really repetitive code, that could be resumed in one line on my configuration.
I'm asking this here, since I don't know if you guys would consider duplicate of this issue.
It would solve my problem if Marshall or Unmarshall (I'm not sure how it works) could have optional argument to set this default UseOmitEmptyOnEveryField to true. Like...
func Marshal(in interface{}, opts ...MarshallOptions) (out []byte, err error) {
return MarshalBuffer(in, make([]byte, 0, initialBufferSize, opts...))
}
Then I would call
if buf, err = bson.Marshal(in, bson.MarshallOptions{DefaultOmitEmpty: true}); err == nil {
// do stuff with buf, like bson.Unmarshall to a bson.M object
}
I think this option can be more controllable for users of this package.
@ddspog
The idea of passing an options struct is cool - there's lots of different users of the BSON marshaller and lots of conflicting opinions on how it should work! It would have to be on a different function to maintain backwards compatibility (can't change the definition of Marshal()
unfortunately).
That said I'm still not sure if pushing more options and logic into the BSON package so users can avoid tagging their structs is really worth it, it is already getting a bit complicated in terms of "switch this feature on, that off" etc - is there a reason you don't want to add bson:",omitempty"
on the fields?
Dom
Hi @domodwyer , bson
has a different usage purpose from encoding/json
, it's more like a internal package. encoding/json
can provide a Marshaller
for customization, but bson
can't, adding more package-level options is a must-have for bson
.
But we still can refactor the current design into a Marshaller
style, and provide package-level options config for a "default marshaller", like log
and other packages, though bson.Marshaller
may be not very useful for end users.
@kassiansun the BSON package is not internal, it's exported and used by others outside of mgo
- function definitions can't change, sorry.
Well, when trying to use collection.Find(document), if document is a struct or a marshaled bson.M version, it won't work nicely (in my case), unless all empty fields are omitempty.
If I use User as a document, for example:
type User struct {
IDV bson.ObjectId `bson:"_id,omitempty"`
CreatedOnV int64 `bson:"created_on,omitempty"`
UpdatedOnV int64 `bson:"updated_on,omitempty"`
UsernameV string `form:"username" binding:"required" bson:"username,omitempty"`
EmailV string `form:"email" bson:"email,omitempty"`
PasswordV string `form:"password" binding:"required" bson:"password,omitempty"`
RoleV byte `bson:"role,omitempty"`
}
When searching using ID (to access a page author info), Username (on login) or Role (who have admin priority?), omitempty is essencial to Marshal process. If any isn't omitempty, like Email, it would make collection.Find(User{UsernameV: "myusername"} return 'not found' error.
So, if I could set the Marshal method to work with OmitEmpt set by default, using bson.SetDefaultOmitEmpty(true) for example, I would be able to work with:
type User struct {
IDV bson.ObjectId `bson:"_id"`
CreatedOnV int64 `bson:"created_on"`
UpdatedOnV int64 `bson:"updated_on"`
UsernameV string `form:"username" binding:"required" bson:"username"`
EmailV string `form:"email" bson:"email"`
PasswordV string `form:"password" binding:"required" bson:"password"`
RoleV byte `bson:"role"`
}
Which is a more readable structure, easier to maintain. If another developer added a needed field, it wouldn't have to remember to add omitempty on my code, and such.
I do understand why bson.Marshal can't change its function definitions, but it would be bad to create a bson.SetDefaultOmitEmpty function (or other better name)?
@domodwyer I've tried my own solution. Made a test using my fork, altered bson/compatibility.go to add this.
// Current state of the Default OmitEmpty option.
var useDefaultOmitEmpty = false
// SetDefaultOmitEmpty enables or disables the use of OmitEmpty on every tag, with or without this flag. It will
// allows BSON Marshall to eliminate the empty tags from final result.
func SetDefaultOmitEmpty(state bool) {
useDefaultOmitEmpty = state
}
// DefaultOmitEmptyState returns the current status of Default OmitEmpty option. See SetDefaultOmitEmpty for more
// information.
func DefaultOmitEmptyState() bool {
return useDefaultOmitEmpty
}
And then on bson/encode.go, at func (e *encoder) addStruct(reflect.Value) body, I've change the condition related to OmitEmpty to this:
if (info.OmitEmpty || DefaultOmitEmptyState()) && isZero(value) {
continue
}
The result is that if I set it to True before translating a object to an M like this:
bson.SetDefaultOmitEmpty(true)
if buf, err = bson.Marshal(in); err == nil {
if err = bson.Unmarshal(buf, &target); err == nil {
out = target.(bson.M)
}
}
Result will come as expected. The problem for me did come when working with collection.RemoveAll(). It was accusing an QueryError of missing limit field. As I investigated, it was my DefaultOmitEmpty that was causing the problem because RemoveAll() set an deleteOp{Limit: 0} which cause "bson:limit" field, to understand that as a empty field, so it omitted, causing the error.
Of course, in my case, the obvious solution to set bson.SetDefaultOmitEmpty(false) after making my Marshal and Unmarshal operations worked on every test that I make. But I would like an opinion on you guys on that matter, before asking for an PR with this change.
Hi, @ddspog this is a different problem, we can't change any bson
behavior as this will affect the internal protocol of MongoDB. So if you want to patch this, patch the json
tag behavior, and use bson.SetJSONTagFallback
.
@kassiansun How could I patch the json
tag behaviour? Can I add a mgo.SetJSONDefaultOmitEmpty(true)? Because I can't simply import the internal package.
@ddspog You need to fork and patch it. You can detect whether the tag is from bson
or json
tag at here:
https://github.com/globalsign/mgo/blob/master/bson/bson.go#L703
There is my previous patch to omit omitempty
, you can simply change it to force omitempty
by setting info.OmitEmpty
at here:
https://github.com/kassiansun/turbo/commit/d9db8306069454f2354fa2882105525ab7f4640b#diff-87630d60b5b20b5644f728b59ec6bd04R702
I've resolved to copy my version of bson.Marshal and bson.UNmarshal method to an internal package.
Currently we have bson.SetJSONTagFallback to fallback to json tag when no bson tag available, which is pretty useful when using protobuf with mgo, leave generated *.pb.go untouched and get a consistent field name between json & bson.
With the default omitempty tag generated by protobuf, updated struct fields with empty values will not be encoded, leave the database untouched. Adding a switch for ignoring omitempty will gain another benefit that we can keep protobuf performant and make updating operations easier.