Open jhump opened 5 years ago
Related to #913
Not only a rollback, a roll forward becomes a nightmare when one has more than one application using the same datastore (like in microservices). Adding a new field will break all of them, regardless of how "related" they are to the change, essentially requiring an update of all services at all times.
For those curious, here's a patch I am currently using to work-around this issue.
However, this patch almost certainly does not work as expected/desired with entity values. This patch is heavily used in production, but only where all nested structs use flattened properties (a hold-over from our app once running in App Engine, whose Datastore APIs did not support entity values).
I'm hoping to get some time to adapt this patch to work with entity values, since I'd like our app to start taking advantage of them (we no longer run anything on App Engine, so now the only constraint keeping us on flattened properties is this issue and this imperfect patch).
@jhump Thanks for the patch!
I've been watching this issue (and the other related ones such as #913 and #3146) for years, wanting to use Go with Cloud Datastore but being blocked by safe rollback/roll-forward. I'm surprised (and concerned) nothing is happening on this topic. Since this seems crucial to be able to use this library in any serious project, I wonder: am I missing why this isn't blocking most people? does everyone using datastore have their own custom fork of the Google Cloud Go libraries to do safe rollback/roll-forward without data loss? Is datastore on the way out, and no longer being invested in? I hate to be that guy that posts the overly dramatic reply, but I'm starting to get quite desperate to get some insight into this issue, just so I can set my expectations whether Go + Cloud Datastore is still worth an investment.
Hi @remko — apologies for the delay here, but just to address a part of your comment, the Datastore Go library is actively developed and new versions are released every ~6 months or so: https://github.com/googleapis/google-cloud-go/releases?q=datastore&expanded=true I think the write-up of this issue in https://github.com/googleapis/google-cloud-go/issues/3146 has more use case detail, and I'm tempted to close this one out and re-open that one (but cross-reference for the "rollback" use case). Thoughts?
@meredithslota Sounds good to me
The behavior of
LoadStruct
is to report an error if the datastore row has any unrecognized properties. This is particularly problematic since it means that roll backs of a service that uses datastore is inherently unsafe.The problematic scenario:
SaveStruct
still encodes them into properties with the field type's zero value.This means that adding fields to an entity effectively poisons the datastore in a way that prevents roll backs.
I understand the behavior, since it does feel like an error that we could be dropping data (though, TBH, if
encoding/json
is okay throwing away data, perhaps so should this library so that it's using familiar idioms/semantics).The bigger problem, in my opinion, is that the APIs do not provide any good way to work around the issue or safely continue processing. The current way to detect this case is to try to type-assert the returned error to
*datastore.ErrFieldMismatch
and then see if theReason
field is"no such struct field"
. Relying on a particular reason string feels particularly brittle. Worse: there is no way to know what properties were unrecognized sinceLoadStruct
only reports the last error. So if there were multiple unrecognized properties, only the last gets reported to the client. It would be far better ifErrFieldMismatch
had anUnknown []Property
field which could be inspected, to gather all unknown properties.I have worked around this by embedding a sort of "mixin" type into every datastore entity struct in my application. However, in order to make this safe and work correctly, I had to fork this client library. My fork does exactly as described above: adds a
Unknown []Property
field to theErrFieldMismatch
type and changesstructPLS.Load
to record all unrecognized properties there. That way the entity struct can actually retain these unrecognized properties and add them to the set of saved properties, in the event the entity is written back to datastore. That way the application does not blow up on unknown properties, but also won't lose them, if an entity is read/updated/written in a transaction.