Open Kami opened 5 years ago
This would be great even to use vanilla python longs that do not fit in a signed int64 (such as an uint64 value).
Currently _set_protobuf_value (called in entity_to_protobuf) breaks with ValueError: Value out of range.
As a user, I could wrap such values in a custom class to prevent using the default integer in favor of the actual uint64.
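To illustrate the range problem, here is a minimal sketch in plain Python. It does not use the client library; the helper fits_signed_int64 and the wrapper class UInt64 are hypothetical names made up for this example, and how such a wrapper would actually be serialized is left open.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1
UINT64_MAX = 2 ** 64 - 1


def fits_signed_int64(value):
    """Return True if value can be stored in a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


class UInt64:
    """Hypothetical wrapper that marks a value as an unsigned 64-bit integer."""

    def __init__(self, value):
        if not 0 <= value <= UINT64_MAX:
            raise ValueError("Value out of range for uint64: %r" % (value,))
        self.value = value


# 2 ** 63 is a valid uint64 but is one past the largest signed int64.
big = 2 ** 63
too_big_for_int64 = not fits_signed_int64(big)
wrapped = UInt64(big)
```

A converter could then check for the wrapper type before falling back to the default integer handling.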
@Kami would you mind sharing some hints around your "very hack-ish local version"? 😄
Right now the Datastore Python client library provides a high-level convenience API for working with native Python types by utilizing a special Entity class which behaves like a dictionary.
Underneath, the client library converts this Entity and its native Python types into an Entity Protobuf object, which is what the Google Datastore API expects and works with.
This works great when you only work with Python and don't care about the actual object schema, but it breaks down if you want to build some kind of cross-language ORM with a strict schema and work with Datastore from multiple programming languages.
There are multiple possible approaches and options when building a programming language agnostic ORM and entity model schema for Datastore, but in the end, you need code to translate your ORM objects into Entity Protobuf objects, which is what the Datastore gRPC API works with.
In our scenario, we decided to use Protobuf as the schema for database models (aka objects / entities which are stored inside the datastore). To accomplish that, we developed libraries which translate arbitrary Protobuf message objects to Entity Protobuf objects.
Here is an example of such a translator library for Python - https://github.com/Kami/python-protobuf-cloud-datastore-entity-translator
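The core of such a translator is a mapping from each field's type to the matching field of the Datastore Value message. Below is a minimal sketch of that idea, not the code of the linked library. It uses plain dicts instead of real Protobuf messages, the function names are hypothetical, and the dict layout is a simplification (the real Entity message stores the kind inside its key path). Only the value field names (string_value, integer_value, double_value, boolean_value) are taken from the Datastore API.

```python
def to_datastore_value(value):
    """Wrap a native Python value in a dict keyed by the Datastore Value field name."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"boolean_value": value}
    if isinstance(value, int):
        return {"integer_value": value}
    if isinstance(value, float):
        return {"double_value": value}
    if isinstance(value, str):
        return {"string_value": value}
    raise TypeError("Unsupported field type: %s" % type(value).__name__)


def translate_model(kind, fields):
    """Translate a flat model (field name -> value) into an Entity-Protobuf-like dict."""
    return {
        "kind": kind,
        "properties": {name: to_datastore_value(v) for name, v in fields.items()},
    }


book = translate_model("Book", {"title": "Dune", "pages": 412, "in_print": True})
```

Because the schema is strict, an unsupported type fails loudly with a TypeError instead of being stored under a guessed type.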
To store this Entity Protobuf object inside the datastore, we have two options:
1. Utilize low level gRPC / Protobuf based Python client
Utilizing the low-level gRPC-based Python client (https://github.com/GoogleCloudPlatform/google-cloud-datastore/tree/master/python) is not ideal, because it requires a lot of glue code in our app.
We need to take care of things such as transactions, rollbacks, etc.
Basically, we need to re-invent the wheel and build a library which is very similar to this high level client library, but works with Entity Protobuf objects instead of Entity classes.
2. Utilize high level client and waste CPU cycles on unnecessary conversion round-trips
Another option is to directly utilize this high level client library.
The problem is that this library doesn't expose primitives for working directly with Entity Protobuf objects.
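This means we need to convert our Entity Protobuf object into an Entity instance, pass that to the client, and let the client convert it back into an Entity Protobuf object before it is sent to the API.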
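A minimal sketch of that round trip is shown below. It is a self-contained simulation, not the real client library: EntityPb, Entity, pb_to_entity, entity_to_pb and high_level_put are all hypothetical stand-ins, and the conversions list exists only to make the redundant work visible.

```python
from dataclasses import dataclass, field


@dataclass
class EntityPb:
    """Stand-in for the Entity Protobuf message (not the real class)."""
    kind: str
    properties: dict = field(default_factory=dict)


class Entity(dict):
    """Stand-in for the dictionary-like high-level Entity class."""

    def __init__(self, kind):
        super().__init__()
        self.kind = kind


conversions = []  # records every conversion that takes place


def pb_to_entity(entity_pb):
    conversions.append("pb -> Entity")
    entity = Entity(entity_pb.kind)
    entity.update(entity_pb.properties)
    return entity


def entity_to_pb(entity):
    conversions.append("Entity -> pb")
    return EntityPb(kind=entity.kind, properties=dict(entity))


def high_level_put(entity):
    """Mimics a high-level put(): accepts only an Entity and converts it back."""
    return entity_to_pb(entity)


# Our translator has already produced an Entity Protobuf object...
original_pb = EntityPb(kind="Book", properties={"title": "Dune"})
# ...but it has to become an Entity first, only to be converted straight back.
sent_pb = high_level_put(pb_to_entity(original_pb))
```

The object that ends up being sent is equal to the one we started with, so both conversions are pure overhead.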
To make life easier for us and for other people with similar problems (if you search around the internet, you will see that more people have run into this), I think it makes sense to expose public methods which work directly with Entity Protobuf objects in this high level client library.
This should be relatively straightforward since the client library already works with Entity Protobuf objects in the background.
It will mostly just require some code shuffling around / refactoring and adding new methods for working directly with Entity Protobuf objects. It would also mean very little additional code to maintain since most of the primitives are already in place today.
Having said that, I propose adding the following new methods which would work directly with Entity Protobuf objects:
client.put_entity_pb
client.get_entity_pb
client.get_multi_entity_pb
query.fetch_entity_pb
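To make the proposal a bit more concrete, here is a rough sketch of how these methods could behave, using a toy in-memory store. Only the four method names come from the list above. The classes InMemoryClient and InMemoryQuery, the argument lists, and the use of plain dicts in place of Entity Protobuf objects are assumptions made for illustration and say nothing about the final signatures.

```python
class InMemoryClient:
    """Toy stand-in for the high-level client; not the real library class."""

    def __init__(self):
        self._store = {}  # maps key -> Entity-Protobuf-like dict

    def put_entity_pb(self, key, entity_pb):
        # Store the object as-is, with no conversion to an Entity instance.
        self._store[key] = entity_pb

    def get_entity_pb(self, key):
        # Return the stored object, or None if the key is unknown.
        return self._store.get(key)

    def get_multi_entity_pb(self, keys):
        # Return the stored objects for all known keys, skipping missing ones.
        return [self._store[k] for k in keys if k in self._store]


class InMemoryQuery:
    """Toy stand-in for a query that filters by kind only."""

    def __init__(self, client, kind):
        self._client = client
        self._kind = kind

    def fetch_entity_pb(self):
        return [pb for pb in self._client._store.values() if pb["kind"] == self._kind]


client = InMemoryClient()
client.put_entity_pb("book-1", {"kind": "Book", "title": "Dune"})
client.put_entity_pb("author-1", {"kind": "Author", "name": "Frank Herbert"})

single = client.get_entity_pb("book-1")
several = client.get_multi_entity_pb(["book-1", "missing", "author-1"])
books = InMemoryQuery(client, "Book").fetch_entity_pb()
```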
I'm happy to implement those changes.
In fact, I already have a very hack-ish version locally which I plan to push later, along with a pull request, so we can start a more concrete discussion about the actual code changes.