jonas32 opened 2 years ago
Wow, this is such a good PR that it makes me consider merging the whole thing before I finish my changeset, just for the heck of it. What came of your experiments with your new async changesets?
We migrated everything to the async branch and haven't experienced any problems yet. That's also the reason why my latest PRs are based on the async branch. Having it based on the current release would make it unavailable for our stuff.
I never made a general independent benchmark since I didn't have the time, but the performance I see from our services looks fine.
My use case (and I guess that goes for many other users) involves running the client behind a bigger framework like actix. In that case, actix manages event loops and pools, so the client will run on multiple loops. It still might make sense to add a feature flag where the client does that itself, but that's probably a lot of work again. Even on a single loop, the client is still at least close to the old sync base in terms of performance, and depending on the system probably faster. But there is potential for more.
There is also more potential in this PR. Currently all functions of the Client expect `Value` type parameters. `Value` itself also implements `ToValue`, so it would be possible to change a lot of the functions to accept anything that implements `ToValue` without breaking the code base even more.
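The `ToValue` idea above can be sketched roughly like this (a minimal standalone sketch; the enum variants, trait, and function names are simplified stand-ins, not the crate's exact API):

```rust
// Minimal sketch: client methods become generic over a ToValue trait,
// so callers can pass raw types as well as pre-built Values.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    String(String),
}

trait ToValue {
    fn to_value(&self) -> Value;
}

// Value implements ToValue itself, so existing call sites keep working.
impl ToValue for Value {
    fn to_value(&self) -> Value {
        self.clone()
    }
}

impl ToValue for i64 {
    fn to_value(&self) -> Value {
        Value::Int(*self)
    }
}

impl ToValue for &str {
    fn to_value(&self) -> Value {
        Value::String((*self).to_string())
    }
}

// A client function generic over ToValue instead of demanding a Value:
fn to_wire_value(val: impl ToValue) -> Value {
    val.to_value()
}

fn main() {
    assert_eq!(to_wire_value(42i64), Value::Int(42));
    // A raw &str and an explicit Value produce the same result.
    assert_eq!(
        to_wire_value("abc"),
        to_wire_value(Value::String("abc".to_string()))
    );
    println!("ok");
}
```

Because `Value: ToValue` is a blanket case, this change would be backward compatible for callers who already construct `Value`s by hand.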
About the failing CI Tests: Looks like the expressions module is broken... Has there been any Server change that could cause that? I don't see that error with Aerospike v5.
any chance you have a `Value -> T` deserializer as well?
can we merge this into the async branch?
@databasedav I'm not so sure that would be a good idea. @jonas32 has taken a practical route in this PR, converting structs to intermediate `Bin`s and `Value`s, which makes it simple, but it would allocate a lot of single-use objects on the heap. He also had to introduce a new `put_struct` API method, which I'm not a fan of. I'd rather have only one method for everything, especially now that we are releasing a major new version. I know it's annoying to wait for the better solution while the practical one already exists, but please give me some time to see if I can engineer a more efficient solution first.
I'm going to improve this one to make it less memory heavy. Did you already start to implement a method for generic input on the value part? Otherwise I would add it here to reduce it to one function again.
Thanks @jonas32. I have not done anything in this regard, but I don't think this approach would yield the desired outcome, although any improvements are welcome. I think the best way to go about it is to have traits like e.g. `BinWireWriter` and `BinWireReader`, so that they would directly read and write from the buffer without needing the intermediate representation. We can even have metadata fields that way, to allow generation, expiration and key to be read and written. Of course we'll provide derive macros for that to automate the process. That method would be stack only and would not need to allocate memory at all, outside of what's necessary for the data (strings, vecs, maps, etc.). We can implement those traits for the `BinMap` so that it is also a `BinWireWriter`, and then change `put` so that it takes that trait as input.
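A rough, self-contained sketch of what such traits could look like (all names and the toy encoding are assumptions for illustration; the real wire format writes bin headers, particle types, etc.):

```rust
// Hypothetical sketch: types serialize themselves straight into the wire
// buffer, skipping any intermediate Bin/Value allocation.
trait BinWireWriter {
    fn write_wire(&self, buf: &mut Vec<u8>);
}

trait BinWireReader: Sized {
    fn read_wire(buf: &[u8]) -> Option<Self>;
}

struct User {
    id: i64,
}

impl BinWireWriter for User {
    fn write_wire(&self, buf: &mut Vec<u8>) {
        // Toy encoding: just the big-endian id. A derive macro would
        // generate this per field.
        buf.extend_from_slice(&self.id.to_be_bytes());
    }
}

impl BinWireReader for User {
    fn read_wire(buf: &[u8]) -> Option<Self> {
        let bytes: [u8; 8] = buf.get(..8)?.try_into().ok()?;
        Some(User { id: i64::from_be_bytes(bytes) })
    }
}

// `put` takes the trait, not a concrete BinMap or Vec<Bin>.
fn put<T: BinWireWriter>(record: &T) -> Vec<u8> {
    let mut buf = Vec::new();
    record.write_wire(&mut buf);
    buf
}

fn main() {
    let buf = put(&User { id: 7 });
    let back = User::read_wire(&buf).unwrap();
    assert_eq!(back.id, 7);
    println!("ok");
}
```

The key property is that nothing heap-allocated stands between the struct and the buffer apart from the buffer itself.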
> I'm going to improve this one to make it less memory heavy. Did you already start to implement a method for generic input on the value part? Otherwise I would add it here to reduce it to one function again.
Something that just occurred to me: we could use const generics to return an array instead of a Vec. That would be allocated on the stack and theoretically much more efficient. The issue with that is longer compile times and bloated code.
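The const-generics idea could look like this (names are illustrative; the point is that the derive macro knows the field count at compile time, so `to_bins` can return a stack-allocated array):

```rust
// Sketch: to_bins returns a fixed-size array instead of a heap Vec.
#[derive(Debug, PartialEq)]
struct Bin {
    name: &'static str,
    value: i64,
}

trait ToBins<const N: usize> {
    fn to_bins(&self) -> [Bin; N];
}

struct Point {
    x: i64,
    y: i64,
}

// Hand-written stand-in for what the derive would generate (N = 2 fields).
impl ToBins<2> for Point {
    fn to_bins(&self) -> [Bin; 2] {
        [
            Bin { name: "x", value: self.x },
            Bin { name: "y", value: self.y },
        ]
    }
}

fn main() {
    let bins: [Bin; 2] = Point { x: 1, y: 2 }.to_bins();
    assert_eq!(bins[0], Bin { name: "x", value: 1 });
    assert_eq!(bins[1], Bin { name: "y", value: 2 });
    println!("ok");
}
```

The trade-off mentioned above is real: each distinct `N` monomorphizes the generic code, which is where the longer compile times and code bloat come from.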
The `BinWireWriter` idea sounds good to me. So the API functions would accept `BinWireWriter` and `BinWireReader` traits as parameters instead of the usual `BinMap`s and `Value`s, and we just implement the traits for the `Value` and `BinMap` structs/enums?

Sounds possible, but I'm not sure if that's in scope of this PR. Maybe it's worth a second PR to first introduce the traits and afterwards patch this one. If you want, I can start to work on the traits first.
> The `BinWireWriter` idea sounds good to me. So the API functions would accept `BinWireWriter` and `BinWireReader` traits as parameters instead of the usual `BinMap`s and `Value`s, and we just implement the traits for the `Value` and `BinMap` structs/enums?
Exactly, although I guess we probably have to come up with better names.
> Sounds possible, but I'm not sure if that's in scope of this PR. Maybe it's worth a second PR to first introduce the traits and afterwards patch this one. If you want, I can start to work on the traits first.
I agree. I was just bouncing ideas off you and the rest of the community, not particularly asking for changes in this PR. With this new scheme, we could even alias the struct fields to custom bin names, and add a lot more capabilities.
Yes, I think we could use the serde crate and its derive features as a reference. I agree the names are a bit too technical. Most users probably can't do much with the "Wire" part if they never worked on the client or server. I would suggest:

Any other suggestions?
How should we handle structs that can't be parsed by the derive macro or need custom logic? I think the trait function will need the buffer as a parameter, so in theory users could just encode it manually. But I'm not sure whether that's too complicated.
Yes, it needs the buffer, and that's the best part. For writes, we could provide a `write_bin(&Bin)` method on the buffer which would take care of the rest. For reads, we need a `read_bins(fields)` method that iterates over the number of fields and then calls a `read_bin() -> Bin` method that returns the next bin so that the struct can set it. The only issue being that I converted the `Bin.name` type to `String`, which probably allocates on the heap for the `String`.
I looked around yesterday and found this library very useful in understanding how this could be implemented:
I'm thinking about adding a second abstraction layer over the buffer that could provide "easy" functions for wire writing instead. If we just add the `write_bin` function to the buffer, the user is still confronted with all the parsing and writing logic in the buffer implementation. I see a big potential for mistakes if users reach for the "lower level" functions instead of the right ones and encode invalid data there. But that would introduce overhead again.
I don't see how the user would have access to any of the lower level API on the buffer. Aren't they all private? The only public API would be those required to read and write `Bin`s. What am I missing?
No, a lot of them are public. The encoder for operations works that way. https://github.com/aerospike/aerospike-client-rust/blob/master/src/commands/buffer.rs#L1366 https://github.com/aerospike/aerospike-client-rust/blob/master/src/operations/mod.rs#LL124C30-L124C30 If we give the user the ability to directly interact with the buffer instance, they can also access them.
Wow, what a noob mistake on my part. We need to mark them `pub(crate)` and call it a day, I guess. I'll go through the API visibility in the library at some point and make sure these mistakes are addressed before release.
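For anyone following along, the effect of the `pub(crate)` fix looks like this (module and method names here are illustrative, not the real buffer.rs):

```rust
// pub(crate) keeps low-level buffer methods callable from inside the
// crate (e.g. by the command encoders) but invisible to library users.
mod buffer {
    pub struct Buffer {
        bytes: Vec<u8>,
    }

    impl Buffer {
        pub fn new() -> Self {
            Buffer { bytes: Vec::new() }
        }

        // Crate-visible only: not part of the public API surface.
        pub(crate) fn write_u8(&mut self, b: u8) {
            self.bytes.push(b);
        }

        pub fn len(&self) -> usize {
            self.bytes.len()
        }
    }
}

fn main() {
    // Same crate, so the pub(crate) method is accessible here; a
    // downstream crate calling write_u8 would get a privacy error.
    let mut buf = buffer::Buffer::new();
    buf.write_u8(0xAB);
    assert_eq!(buf.len(), 1);
    println!("ok");
}
```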
Now that I think of it, we probably do not need to pass `Bin`s to these newly discussed APIs. Simple tuples would do.
I tested a bit how this could be done. Maybe you want to give some thoughts on it before I continue to implement it everywhere. In this version, I completely work around the original encoders and do the encoding entirely in the trait implementations. The focus was primarily on deprecating Bins and Values and working only in this style. But to keep the API more or less stable, it's easily possible to also implement the traits on Bins and Values, like I did in traits.rs.
Currently this is only implemented for WriteCommand. There is a unit test that passes for it, but most other tests are probably broken now.
Here you see the traits for Bins and Values as writers. At first I wanted to make it one trait, but that gives a lot of potential for passing wrong arguments to some functions. https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-core/src/traits.rs The biggest disadvantage, in my opinion, is having two writer functions on values. But I didn't find a way to solve this without having a Value step in between yet, since Values are packed differently when they are in a CDT context...
Client is modified to directly accept the Trait instances: https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-core/src/client.rs#L268
This is the derive part that builds the functions. Nearly all allocations are resolved at compile time now. https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-macro/src/traits/writable.rs
In the end, it can be used like this: https://github.com/jonas32/aerospike-client-rust/blob/async-derive/tests/src/derive.rs
The code is not beautiful yet and not really optimized. It's just a "working PoC" version.
> I tested a bit how this could be done. Maybe you want to give some thoughts on it before I continue to implement it everywhere. In this version, I completely work around the original encoders and do the encoding entirely in the trait implementations. The focus was primarily on deprecating Bins and Values and working only in this style. But to keep the API more or less stable, it's easily possible to also implement the traits on Bins and Values, like I did in traits.rs.
Thank you @jonas32 , this is more or less what I had in mind.
> The biggest disadvantage, in my opinion, is having two writer functions on values. But I didn't find a way to solve this without having a Value step in between yet, since Values are packed differently when they are in a CDT context...
I don't think there's a clean way around this, but you could have one writer with a parameter selecting one or the other, if you wanted. I prefer the current distinction myself.
> Client is modified to directly accept the Trait instances: https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-core/src/client.rs#L268
Great. I'd like to avoid having a different set of APIs as much as we can.
> This is the derive part that builds the functions. Nearly all allocations are resolved at compile time now. https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-macro/src/traits/writable.rs
> In the end, it can be used like this: https://github.com/jonas32/aerospike-client-rust/blob/async-derive/tests/src/derive.rs
> The code is not beautiful yet and not really optimized. It's just a "working PoC" version.
I believe this is a step in the right direction and we are closer to the final design now. We have to come up with better names for the macros and traits though.
There are a few more ideas I have that we may want to pursue. Off the top of my head:

- Having modifier macros on the fields themselves
- We may want to allow easy integration with data validation libraries
- Out of the box `std::Option<T>` support?
- Flattening the structs

ps. Just a little nit: I changed the `[Bin; X]` definitions into a single const generic in the async branch.
> Having modifier macros on the fields themselves

Yes, definitely. This will be needed.

Edit: Check out the links again. Working now.
> We may want to allow easy integration with data validation libraries

I'm not sure if that's in scope for the normal lib. I feel like this is rather the job of a webserver etc. instead of the DB layer. But we could think about adding it behind a feature flag, maybe?
> Out of the box `std::Option` support?

Yes, that will come too. I already worked on it a little. I think I'll introduce another function to check if a bin should be parsed at all. That would make Result or Option easy to build.
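One way that "should this bin be parsed at all" hook could shake out (a toy sketch; all names are assumed, and values are modeled as plain `i64` instead of the wire buffer):

```rust
// A missing bin fails required fields but maps to None for Option fields.
trait ReadBin: Sized {
    fn from_bin(raw: Option<i64>) -> Result<Self, String>;
}

impl ReadBin for i64 {
    fn from_bin(raw: Option<i64>) -> Result<Self, String> {
        raw.ok_or_else(|| "missing required bin".to_string())
    }
}

impl<T: ReadBin> ReadBin for Option<T> {
    fn from_bin(raw: Option<i64>) -> Result<Self, String> {
        match raw {
            None => Ok(None), // absent bin is fine for an Option field
            some => T::from_bin(some).map(Some),
        }
    }
}

fn main() {
    assert_eq!(i64::from_bin(Some(5)), Ok(5));
    assert!(i64::from_bin(None).is_err());
    assert_eq!(<Option<i64>>::from_bin(None), Ok(None));
    assert_eq!(<Option<i64>>::from_bin(Some(5)), Ok(Some(5)));
    println!("ok");
}
```

The blanket `Option<T>` impl is what makes "optional" compose: any readable type becomes optionally readable for free.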
> Flattening the structs

Maybe as an additional field modifier? I'm not a big fan of that, since I think parsing structs from and to maps is a huge use case. For me it's probably the primary one, since building nested structures manually on Values is not very friendly right now.
@jonas32 can `WritableBins` (or whatever it gets renamed to) include a compile time check that the bin name is less than the 15 byte maximum? I haven't had the chance to try it out yet and wasn't able to find a check like that here (although I just checked around the `panic!`s)
Great idea, didn't think of that yet. Will come!
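One way such a check could work (a sketch, not the PR's implementation): the derive macro emits a const evaluation per bin name, so an over-long name fails the build instead of panicking at runtime. The helper name is hypothetical; the 15-byte limit is the one discussed above.

```rust
// Const fn panics are compile errors when evaluated in a const context
// (stable since Rust 1.57), which is exactly what we want here.
const fn check_bin_name(name: &str) -> &str {
    assert!(name.len() <= 15, "bin name exceeds the 15 byte maximum");
    name
}

// What generated code could look like; a name longer than 15 bytes here
// would be rejected at compile time, not at runtime.
const USER_ID_BIN: &str = check_bin_name("user_id");

fn main() {
    assert_eq!(USER_ID_BIN, "user_id");
    println!("ok");
}
```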
I just found another problem while working on the read macro. Stream read is processed differently from normal read. Normal read initially reads the full buffer and then works on it. Stream read does not do that: it reads the buffer piece by piece as the info is needed, while discarding all the previous data. If I want to do it without copying parts of the buffer and allocating twice, I need to keep the full buffer until the record is parsed. After that, it can be deleted. @khaf do you see any potential problems with that change?

Edit: Alternatively, the normal read could also be changed to read piece by piece.
As far as I can see, the required full size information should be included in this part of the buffer: https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/stream_command.rs#L67
Just to clarify: Do you mean that you keep the buffer engaged until the record is dropped to avoid memory allocation at all cost?
No, the record is not getting dropped. I just need the buffer until the Record is fully parsed. But I already came to the conclusion that it's better the other way around: changing the normal read command to smaller buffers. Now the smaller buffers with just the value data inside get cloned, since they are allocated anyway and the original buffer will discard them right after the clone. That should result in nearly the same memory usage and makes it a lot easier.
@jonas32 I see, you want to move to a streaming paradigm to reduce the memory usage. That's fantastic, but with caveats:

- The buffer size needs to be chosen carefully (something like `max(required, min(buf size, message size))`). If Rust supports increasing the buffer size of the socket, you could go that way and just let it buffer for you internally and save yourself some headache, but I have not checked the docs myself.

I would love to see the streaming API implemented; that would allow us to have constant sized, small buffers and get rid of one of the most annoying parts of the client tuning. Just as great would be to support this in sending commands as well, circumventing the buffer allocation, sizing, resizing, pooling and deallocation. My initial design of the client was to support this, that's why the buffer struct is there, but we deviated from that idea at some point.
ps. Calling `socket.read` in an async system is even worse than a syscall: you'll yield from the coroutine to the runtime scheduler, with all the consequences.
Per this post, the first issue is easily fixable: https://stackoverflow.com/questions/19076719/how-do-i-convert-a-vector-of-bytes-u8-to-a-string
I'm still fully relying on the Buffer and encoder/decoder methods that are in place, so this should already solve it: https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/buffer.rs#L1288 I already tried it that way and it seems to be fine.
I also thought about the consequences of reading so often in small pieces. I'll test some things and check if we can improve the performance there.
> I'm still fully relying on the Buffer and encoder/decoder methods that are in place, so this should already solve it: https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/buffer.rs#L1288 I already tried it that way and it seems to be fine.
This still requires an intermediary buffer to read into. My solution would allow reading directly into the string builder's buffer, and would remove the need to have a big intermediary buffer completely. All other data types would require mere bytes of buffer to read, and would work with buffers allocated on stack. Anyway, I still do not understand what you meant, since in the streaming implementation we also read chunks into the buffer and decode them off of it.
> I also thought about the consequences of reading so often in small pieces. I'll test some things and check if we can improve the performance there.
I think you should first take the naive approach. Improve after you hit performance issues.
Ah ok, now I get what you mean with the buffer strings. I think it's a question of usability vs. memory again there. If we pre-parse strings at that point, the trait implementations can't all work on the Buffer, but rather on enums. At that point, we basically start to reintroduce value.rs. Additionally, it would no longer work if a user wants to do custom parsing of string data into their own datatype.
I think we are talking past each other due to my inability to internalize the problem you are facing. When you have your initial POC (even if it does not work), I'll take a look and then we can talk about the specifics. Does that sound good?
ReadableBins is now working too. It supports the basic stuff like ints and strings, lists, nested lists, options and field annotations (only rename) by now.
While doing that, I ran into three questions.
My full list is this currently (Read/Write):

* i8 - i32 (W)
* i64/isize (R/W)
* u8 - u32 (W)
* u64/usize (R/W)
* f32 (W)
* f64 (R/W)
* &str (W)
* String (R/W)
* bool (R/W)
* Option (R/W)
* Vec (R/W)
* HashMap<T, T> (R/W)
* aerospike::Value (R/W)

I'll probably also have a look at Bitwise/HLL later on, but I'm not sure yet if I want to introduce custom types or annotations.
> While doing that, I ran into three questions.
> * First, from what I see, the limit for the buffer is at `reclaim_threshold`, around 65 KB. How are larger blobs handled? The record limit can be up to 8 MB, so that would not fit in one pull. Was that what you were talking about originally?
The way that works is that it allocates the data for one command, and then trims back the buffer and releases the excess memory. Not the most efficient, but it works if tweaked well by setting `reclaim_threshold` to a P95 record size.
> * Second, https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/msgpack/decoder.rs#L59 What is the purpose of this kind of ext data? I have never seen it in any buffer.
That's because of our less-than-standard use of Msgpack. Don't mind it, it is what it is.
> * I'm planning to implement Readable/Writable for most everyday data types. How far should we extend that? From what I see in the old packers, Aerospike only cares about i64 and f64; everything else is cast. Should we cast back down to smaller types for reading too? I see a risk of wrong data being given to the user with that. It happened to me during testing with f32. Do we also want to implement it for some of the std types?
I believe it was and still is a bad idea to automatically cast/convert data, outside of integers. I'm not sure why f32 was even supported. I'd rather remove it. I believe we should not venture outside of what we already have with Values, and for custom converters, it should happen in a rigid and lossless manner.
> My full list is this currently (Read/Write):
> * i8 - i32 (W)

Can be cast down at runtime; it's lossless. For the ones that don't match the size, a parsing error should be returned.
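The lossless-downcast rule sketched in code (the function name is illustrative): the wire carries `i64`, and reading into a smaller integer succeeds only when the value fits, otherwise a parse error is returned instead of silently truncating.

```rust
// try_from fails on overflow, unlike the `as` cast, which would truncate.
fn read_as_i32(wire: i64) -> Result<i32, String> {
    i32::try_from(wire).map_err(|_| format!("{} does not fit in i32", wire))
}

fn main() {
    assert_eq!(read_as_i32(1000), Ok(1000));
    assert!(read_as_i32(i64::MAX).is_err()); // `as` would have truncated
    println!("ok");
}
```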
> * i64/isize (R/W)
> * u8 - u32 (W)

Same as above.
> * u64/usize (R/W)
> * f32 (W)

We should probably remove this completely. Let the users cast up/down at their own risk.
> * f64 (R/W)
> * &str (W)

Why no read?
> * String (R/W)
> * bool (R/W)
> * Option (R/W)
> * Vec (R/W)
> * HashMap<T, T> (R/W)
> * aerospike::Value (R/W)

Is this for GeoJSON?
Now that you have made the change, where do I have to look to understand the point you were making before regarding non-streaming buffers?
Ok, limiting the data types sounds good to me. I want to keep supporting Value, since it's not that easy to build every data layout. Since Aerospike does not care about data types inside of CDT contexts, it could become hard to build lists/maps otherwise. It also serves as a compatibility layer so we don't completely break everything.
About the buffers: My first idea was to edit https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/stream_command.rs#L83 to make it look like https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/read_command.rs#L143, where the whole response is read before touching it. But I figured out myself that this might not be the best idea, since this response can grow big. So I did exactly the opposite and changed the read command to handle the buffer exactly like the stream command. It now reads the buffer operation by operation. These operations are then cloned out of the buffer into a smaller one that only holds this one operation (just the particle type and the blob itself). This is then appended to the PreParsedValue struct and given to the trait implementations for further parsing. This way, the trait implementations can still use the buffer functions for reading the individual value types, just as the Value struct did before.
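The per-operation read described above can be sketched like this (a toy length-prefixed framing for illustration, not the real Aerospike wire protocol):

```rust
// Instead of buffering the whole response, each operation's payload is
// read into its own small buffer, which trait implementations can then
// parse independently while the rest of the stream is still in flight.
use std::io::Read;

fn read_operations(mut stream: impl Read) -> Vec<Vec<u8>> {
    let mut ops = Vec::new();
    loop {
        // Read just the 4-byte length header of the next operation.
        let mut len_buf = [0u8; 4];
        if stream.read_exact(&mut len_buf).is_err() {
            break; // end of stream: no more operations
        }
        let len = u32::from_be_bytes(len_buf) as usize;

        // Small per-operation buffer; the full response never lives in
        // memory as a whole.
        let mut payload = vec![0u8; len];
        stream.read_exact(&mut payload).expect("truncated operation");
        ops.push(payload);
    }
    ops
}

fn main() {
    // Two framed operations: [len=2]"hi" and [len=1]"x".
    let mut wire = Vec::new();
    wire.extend_from_slice(&2u32.to_be_bytes());
    wire.extend_from_slice(b"hi");
    wire.extend_from_slice(&1u32.to_be_bytes());
    wire.extend_from_slice(b"x");

    let ops = read_operations(&wire[..]);
    assert_eq!(ops, vec![b"hi".to_vec(), b"x".to_vec()]);
    println!("ok");
}
```

As noted earlier in the thread, each small read is a trip through the async scheduler, so batching the header and payload reads may matter for performance.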
This PR adds the `ToBins` and `ToValue` derive macros. Manually parsing large structs into Aerospike Bins or Values is a lot of boilerplate work currently.

The macros can simply be attached to any struct. They will add a `to_bins` or `to_value` function. The `to_bins` function returns a `Vec<Bin>` that can be passed to write functions like the `client::put` function. The `to_value` function builds a HashMap Value for the struct. ToBins is based on ToValue, so sub-structs are also parsed. The ToValue trait has implementations for the common data types, including variations with HashMaps or Vecs (if I didn't forget any, otherwise please tell me).

I also added a `client::put_struct` method to make it more convenient.

Caution: this branch is based on #108 (Async Client)
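A hand-written sketch of roughly what `#[derive(ToBins)]` expands to for a small struct (the `Value`/`Bin` types here are simplified stand-ins for the crate's real ones):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    String(String),
    HashMap(HashMap<String, Value>),
}

#[derive(Debug, PartialEq)]
struct Bin {
    name: String,
    value: Value,
}

trait ToValue {
    fn to_value(&self) -> Value;
}

trait ToBins {
    fn to_bins(&self) -> Vec<Bin>;
}

struct User {
    id: i64,
    name: String,
}

// to_value builds a HashMap Value, so the struct can also be nested
// inside another struct as a sub-value.
impl ToValue for User {
    fn to_value(&self) -> Value {
        let mut map = HashMap::new();
        map.insert("id".to_string(), Value::Int(self.id));
        map.insert("name".to_string(), Value::String(self.name.clone()));
        Value::HashMap(map)
    }
}

// to_bins returns a Vec<Bin> suitable for write functions like client::put.
impl ToBins for User {
    fn to_bins(&self) -> Vec<Bin> {
        vec![
            Bin { name: "id".to_string(), value: Value::Int(self.id) },
            Bin { name: "name".to_string(), value: Value::String(self.name.clone()) },
        ]
    }
}

fn main() {
    let user = User { id: 1, name: "a".to_string() };
    let bins = user.to_bins();
    assert_eq!(bins.len(), 2);
    assert_eq!(bins[0].name, "id");
    println!("ok");
}
```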