apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0
14.37k stars 1.3k forks source link

Support as_foundationdb_key interface objects in Python tuple layer #4230

Open alexkreidler opened 3 years ago

alexkreidler commented 3 years ago

Currently, users can write

obj = # instance of class with as_foundationdb_key() method
db[obj] = "test"
tr[obj] ="transaction_test"

However, when trying to use an object with an as_foundationdb_key method with the Subspace or Tuple layer, one gets ValueError: Unsupported data type. For example:

fdb.tuple.pack(("some_space", 1, obj))

# or
subspace[obj] = "test_subpace"
Full error traceback ``` Traceback (most recent call last): File "./test.py", line 20, in g.add((donna, RDF.type, FOAF.Person)) File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/rdflib/graph.py", line 389, in add self.__store.add((s, p, o), self, quoted=False) File "/home/alex/c2/tdb/rdflib_foundationdb/store.py", line 168, in add tr_add(self.db, triple, self.store) File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/impl.py", line 266, in wrapper ret = func(*largs, **kwargs) File "/home/alex/c2/tdb/rdflib_foundationdb/store.py", line 79, in tr_add xs = subspace[b"spo"][spo] File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/subspace_impl.py", line 35, in __getitem__ return Subspace((name,), self.rawPrefix) File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/subspace_impl.py", line 29, in __init__ self.rawPrefix = fdb.tuple.pack(prefixTuple, prefix=rawPrefix) File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/tuple.py", line 401, in pack res, version_pos = _pack_maybe_with_versionstamp(t, prefix) File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/tuple.py", line 385, in _pack_maybe_with_versionstamp child_bytes, version_pos = _reduce_children(map(_encode, t)) File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/tuple.py", line 291, in _reduce_children for child_bytes, child_pos in child_values: File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/tuple.py", line 366, in _encode child_bytes, version_pos = _reduce_children(map(lambda x: _encode(x, True), value)) File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/tuple.py", line 291, in _reduce_children for child_bytes, child_pos in child_values: File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/tuple.py", line 366, in child_bytes, version_pos = _reduce_children(map(lambda x: _encode(x, True), value)) File "/home/alex/miniconda/envs/tdb/lib/python3.7/site-packages/fdb/tuple.py", line 370, in _encode raise ValueError("Unsupported data type: " + str(type(value))) ValueError: Unsupported data type: ```

To improve the consistency and ease of use of the API, I propose

I'd love any thoughts or feedback. I'm open to writing a PR if it's welcome.

Thanks for the project!

jzhou77 commented 3 years ago

Note there is also the as_foundationdb_value method. It might be debatable whether as_foundationdb_value or as_foundationdb_key should be chosen.

sfc-gh-abeamon commented 3 years ago

I suppose it's possible there could be a separate method to use for tuple encoding the object rather than either of these. You could imagine an advanced version of this that lets you fully encode and decode it with a custom type code (using one of the ones reserved for custom types), or just one that lets you return a type that is already tuple encodable.

alexkreidler commented 3 years ago

@jzhou77 Yes, that makes sense, we should use as_foundationdb_key for the key and as_foundationdb_value for the value

@sfc-gh-ajbeamon In fact, I did something similar to work around this: I created a method as_tuples() -> tuple on my classes and then simply wrote subspace[obj.as_tuples()]. Then I didn't have to pack and unpack myself. I also implemented as_foundationdb_key() on the class just by calling fdb.tuple.pack(self.as_tuples()), although I'm realizing now that I didn't use as_foundationdb_key anywhere else in the code.

I guess based on this I'd propose:

  1. use an as_tuples (or some other name) method if it is available and handle it as if it were a direct tuple returned
  2. if that's not available and as_foundationdb_key or as_foundationdb_value (depending on the context) is, use those methods and handle it as bytes

The only thing missing is a method that returns a representation as a type that is directly supported by the tuple layer: e.g. a str or int or long. This could be a separate method like as_tuple_type, or potentially just be another type returned by the one as_tuples method.

LMK how this sounds.

sfc-gh-abeamon commented 3 years ago

Is this proposal saying that the new method as_tuple would returned a packed tuple byte string in any context where as_foundationdb_key or as_foundationdb_value could be used?

I think my preference would be that if a user wants this behavior in a key or value context, they would use the existing functions as_foundationdb_* and return the packed tuple from them, as you did on your class. To get the behavior of treating the type as a packable tuple type, having the separate as_tuple_type method that returns either just bytes or any packable type sounds like a good solution.

I'm not sure if there's a use-case that doesn't address, so if there is let me know.