tarantool / cartridge-springdata

Spring Data Tarantool
Other
18 stars 7 forks source link

RFC: Сonfusing annotation functionality of @Tuple #78

Closed ArtDu closed 2 years ago

ArtDu commented 2 years ago

Problem

Now we have the @Tuple annotation globally solving two different problems unrelated to each other:

  1. Specifying a spacename in POJO(Entity) for the SimpleTarantoolRepository to work correctly (i.e. for the work of predefined functional), so that the functions access the required spacename.
  2. An additional indication in the Repository for functions under the annotation @Query. An indication of what we expect in the response is only a serialized object - Tuple(number indexed table), with the name of the space, to use the metadata to get keys for serialization. Without this @Query specification, the method expects an object (Map, or primitive data type) to be accepted. This is necessary, as I understand it, because converters that can accept both Tuple and other data types have not yet been implemented.

Due to the fact that the annotation performs two different functions, this confuses the user. There are problems with deciding where to specify the annotation, and where not to, and for what purposes. The user can interpret @Tuple as an indication of which space we want to use and its format (metadata), or as an indication of the return type of Tuple (in truth, this may not immediately come to mind).

Suggested ways:

We can go two ways: 1) Change the API a) Break backwards compatibility b) Leave support for existing functionality 2) Write more detailed documentation with all notes

Research

1) Change the API

Suggestions:

Move the work logic to @Query

@Tuple in Repository is hardwired to @Query, so it can only be used in conjunction with @Query. We could move the functionality of @Tuple to @Query by adding additional parameters. Examples:

Break backwards compatibility We can completely remove the old @Tuple logic for Repository, or leave it and not use it as a depricate solution.

2) Document in detail the description

Explicitly indicate in the README, in the important section, or in bold that @Tuple in the Repository is used to identify that the Tarantool output will accept ONLY Tuples for domainClass returnType - i.e. number indexed table. For example, this could be {{val1, val2}} or box.space.get (...) etc. @Tuple in Entity is needed so that Tarantool can use the predefined SimpleTarantoolRepository and access the correct spacename.

In README now:

You can bind repository methods to calls of the stored functions in the Tarantool instance using the @Query annotation with the stored function name specified in the functionName parameter. For such methods, you can specify the stored function response format so that it will be parsed correctly. The response format may be either an object (and a list of objects) or a tuple (and a list of tuples).

The response format may be either an object (and a list of objects) or a tuple (and a list of tuples).

It seems worth clarifying what an object is and what a tuple is. That the response format may not be an object but a map (map, string indexed table), and a tuple is a number indexed table {{val1, val2}}. It is also worth pointing out that we can return primitive data types.

For such methods, you can specify the stored function response format so that it will be parsed correctly

It should be stated explicitly that if we specify @Tuple, then we accept only an array (number indexed tuple). And if we do not specify it, then map (string indexed table) or primitive data.

Also, JavaDoc comments are clearly outdated. https://github.com/tarantool/cartridge-springdata/blob/b230551938351152fe786b59e86fc2a083d397cc/src/main/java/org/springframework/data/tarantool/core/mapping/Tuple.java#L12-L39

PS See also #30 to make API consistent.

ArtDu commented 2 years ago

Feedback after verbal discussion(2022.01.13) with @vrogach2020 @wey1and :

1) Remove @Tuple from the RepositoryInterface (break backward compatibility) 2) Change @Query by default accept Tuple (to be like in SimpleTarantoolRepository) 3) Without flags in @Query. We look at the presence of @Tuple in Entity, if there is, then we expect tuple response (+ we take the schema name). If not, it expects map/primitive.

4) (Separate ticket) We always want to parse metadata if we expect Tuple in the response - {"metadata" : [{'name': 'id', 'type': ''}, {'name': ''}], "rows ": [...]} (not working now), then:

ArtDu commented 2 years ago

Discussed by voice with @wey1and (2022.01.14)

We decided that the solution with the parameter in @Query would still be much more transparent. Those. the schema will be like this:

1) Remove @Tuple from the RepositoryInterface (break backward compatibility) 2) Change @Query to accept Tuple by default (to be like in SimpleTarantoolRepository) 3) We add an additional flag for @Query, which will indicate whether we use a converter for tuples or for objects. Options:

@Query(function = "", receiveTuple = false) // default recieveTuple = true
@Query(function = "", tupleConverter = false) // default tupleConverter = true
@Query(function = "", useTupleConverter = false) // default tupleConverter = true
...

Those if the resulting type has @Tuple(spaceName = "") annotation, then we can use SimpleTarantoolRepository functions and functions with @Query(function = "") annotation

akudiyar commented 2 years ago

Thank you for this detailed RFC! Let me put down some thoughts there, that may help improve it:

This is necessary, as I understand it, because converters that can accept both Tuple and other data types have not yet been implemented.

There were no objectives for combining both tuples and other types in the server response yet. The core problem that required a presence of the Tuple annotation on methods -- providing a possibility for the users to specify what type of the server response they expect. Automatic detection of the response type is not possible since the serialization format is the same for all cases and the internal representation is not distinguishable from the transport (e.g. driver) POV.

Explicitly indicate in the README, in the important section, or in bold that @Tuple in the Repository is used to identify that the Tarantool output will accept ONLY Tuples for domainClass returnType - i.e. number indexed table.

The hard thing to understand here is the fact that: a) The server response format is different for each case internally and has many types (apart from the usual combination of single object / multi object of same type for many other database servers, where most of them are dealing with SQL rows for both variants and most of others use simple structures like JSON). Our protocol is very complicated, since the actual data are buried inside several layers of wrapping that follows down the API layers (IPROTO, crud, Lua responses). We obviously cannot solve this complexity by some smart magic inside the driver, because it requires too much knowledge of the user logic behind the data, so it will either leak inevitably to the surface or will not work for all cases. I'd prefer showing the "nuts and bolts" to the users rather than frustrating them by artificial restrictions. b) We have two directions of serialization actually. That probably needs to be more clarified in README. And when we are specifying an annotation on a POJO class, it tells not only about deserialization, but also about the serialization. We have to support at least three cases: writing tuples, writing maps and writing primitives. That follows the types of API calls that we have. And we cannot write POJOs as tuples without specifying a special annotation, that's how it is done in Spring Data framework. On the way back, when deserializing a value, it is necessary to give the deserializer a hint about how to interpret the msgpack structures it receives as an input, since they all are the same from the protocol POV, but are different from the business logic POV. I don't like much the term "number indexed table" since it is actually wrong: for tuples we have two nested tables (always), while we can either have or not have a wrapping table depending on whether we are receiving a single object or a list of them. Furthermore, a string-indexed table as you are referring to it is a special case of a primitive and it may appear either a container-type value or a response object type.The key thing is that the driver takes responsibility for dealing with serialization/deserialization of the whole diversity of the object types, and here we (or the user) just need to provide the proper hint for the the driver about how to interpret the particular input/output objects.

akudiyar commented 2 years ago
@Query(function = "", receiveTuple = false) // default recieveTuple = true
@Query(function = "", tupleConverter = false) // default tupleConverter = true
@Query(function = "", useTupleConverter = false)

Bool flags is a no way here, use either enums or classes references.

ArtDu commented 2 years ago

I will try to express my thoughts, analysis about these comments.

Consider different response structures from Tarantool:


From these response structures, we can infer:

I don't like much the term "number indexed table" since it is actually wrong: for tuples we have two nested tables (always), while we can either have or not have a wrapping table depending on whether we are receiving a single object or a list of them.

Our protocol is very complicated, since the actual data are buried inside several layers of wrapping that follows down the API layers (IPROTO, crud, Lua responses). We obviously cannot solve this complexity by some smart magic inside the driver, because it requires too much knowledge of the user logic behind the data, so it will either leak inevitably to the surface or will not work for all cases.

The core problem that required a presence of the Tuple annotation on methods -- providing a possibility for the users to specify what type of the server response they expect.

And the problem is that we can enable this validation only to check tuples without metadata (we will assume that metadata from crudResponse does not exist, because we do not use it). For other types(Map, Primitives) there is no validation. The problem that emerges from this is that in enum we can only write something like this

public enum ValidationFormat {
    FLATTEN_TUPLE,
    ANY_EXCEPT_FLATTEN_TUPLE
}

What doesn't look pretty 😞

Proposal

Therefore I propose repeat steps 1, 2 from here https://github.com/tarantool/cartridge-springdata/issues/78#issuecomment-1013149250 and use enum like below: 1) Remove @Tuple from the RepositoryInterface (break backward compatibility) 2) Change @Query to accept Tuple by default (to be like in SimpleTarantoolRepository) 3) We add an additional parameter in @Query

@Query(function = "returning_simple_map", output = ValidationFormat.ANY)

public enum ValidationFormat {
    FLATTEN_TUPLE,
    ANY
}

And either make a ticket, or implement within one PR the ability to accept a tuple(array) as TarantoolTuple or as ArrayList from callForObject.

ArtDu commented 2 years ago

With @vrogach2020 @wey1and decided that now the best solution is to do this: 1) Remove @Tuple from the RepositoryInterface (break backward compatibility) 2) Change @Query to accept Tuple by default (to be like in SimpleTarantoolRepository) 3) We add an additional parameter in @Query

@Query(function = "returning_simple_map", output = TarantoolSerializationType.AUTO)

public enum TarantoolSerializationType {
    TUPLE,
    AUTO
}

And make it so that callForObject(TarantoolSerializationType.AUTO) can accept: TUPLE, MAP, PRIMITIVES and List<Object>, now it can only accept: MAP, PRIMITIVES