jonas32 opened 2 years ago
Wow, this is such a good PR that it makes me consider merging the whole thing before I finish my changeset, just for the heck of it. What came of your experiments with your new async changesets?
We migrated everything to the async branch and haven't experienced any problems yet. That's also the reason why my latest PRs are based on the async branch. Having it based on the current release would make it unavailable for our stuff.
I never made a general independent benchmark since I didn't have the time, but the performance I see from our services looks fine.
My use case (and I guess that goes for many other users) involves running the client behind a bigger framework like actix. In that case, actix manages event loops and pools, so the client will run on multiple loops. It still might make sense to add a feature flag where the client does that itself, but that's probably a lot of work again. Even on a single loop, the client is still at least close to the old sync base in terms of performance, and depending on the system probably faster. But there is potential for more.
There is also more potential in this PR. Currently all functions of the Client expect `Value` type parameters. `Value` itself also implements `ToValue`, so it would be possible to change a lot of the functions to accept anything that implements `ToValue` without breaking the code base even more.
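The `ToValue` idea above can be sketched roughly like this (a minimal standalone sketch; the enum variants, trait, and function names are simplified stand-ins, not the crate's exact API):

```rust
// Minimal sketch: client methods become generic over a ToValue trait,
// so callers can pass raw types as well as pre-built Values.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    String(String),
}

trait ToValue {
    fn to_value(&self) -> Value;
}

// Value implements ToValue itself, so existing call sites keep working.
impl ToValue for Value {
    fn to_value(&self) -> Value {
        self.clone()
    }
}

impl ToValue for i64 {
    fn to_value(&self) -> Value {
        Value::Int(*self)
    }
}

impl ToValue for &str {
    fn to_value(&self) -> Value {
        Value::String((*self).to_string())
    }
}

// A client function generic over ToValue instead of demanding a Value:
fn to_wire_value(val: impl ToValue) -> Value {
    val.to_value()
}

fn main() {
    assert_eq!(to_wire_value(42i64), Value::Int(42));
    // A raw &str and an explicit Value produce the same result.
    assert_eq!(
        to_wire_value("abc"),
        to_wire_value(Value::String("abc".to_string()))
    );
    println!("ok");
}
```

Because `Value: ToValue` is a blanket case, this change would be backward compatible for callers who already construct `Value`s by hand.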
About the failing CI Tests: Looks like the expressions module is broken... Has there been any Server change that could cause that? I don't see that error with Aerospike v5.
any chance you have a `Value -> T` deserializer as well?
can we merge this into the async branch?
@databasedav I'm not so sure that would be a good idea. @jonas32 has taken a practical route in this PR, converting structs to intermediate `Bin`s and `Value`s, which makes it simple, but it would allocate a lot of single-use objects on the heap. He also had to introduce a new `put_struct` API method, which I'm not a fan of. I'd rather have only one method for everything, especially now that we are releasing a major new version. I know it's annoying to wait for the better solution while the practical one already exists, but please give me some time to see if I can engineer a more efficient solution first.
I'm going to improve this one to make it less memory heavy. Did you already start to implement a method for generic input on the value part? Otherwise I would add it here to reduce it to one function again.
Thanks @jonas32. I have not done anything in this regard, but I don't think this approach would yield the desired outcome, although any improvements are welcome. I think the best way to go about it is to have traits like e.g. `BinWireWriter` and `BinWireReader`, so that they would directly read and write from the buffer without needing the intermediate representation. We can even have metadata fields that way, to allow generation, expiration and key to be read and written. Of course we'll provide derive macros for that to automate the process. That method would be stack only and would not need to allocate memory at all, outside of what's necessary for the data (strings, vecs, maps, etc.). We can implement those traits for the `BinMap` so that it is also a `BinWireWriter`, and then change `put` so that it takes that trait as input.
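A rough, self-contained sketch of what such traits could look like (all names and the toy encoding are assumptions for illustration; the real wire format writes bin headers, particle types, etc.):

```rust
// Hypothetical sketch: types serialize themselves straight into the wire
// buffer, skipping any intermediate Bin/Value allocation.
trait BinWireWriter {
    fn write_wire(&self, buf: &mut Vec<u8>);
}

trait BinWireReader: Sized {
    fn read_wire(buf: &[u8]) -> Option<Self>;
}

struct User {
    id: i64,
}

impl BinWireWriter for User {
    fn write_wire(&self, buf: &mut Vec<u8>) {
        // Toy encoding: just the big-endian id. A derive macro would
        // generate this per field.
        buf.extend_from_slice(&self.id.to_be_bytes());
    }
}

impl BinWireReader for User {
    fn read_wire(buf: &[u8]) -> Option<Self> {
        let bytes: [u8; 8] = buf.get(..8)?.try_into().ok()?;
        Some(User { id: i64::from_be_bytes(bytes) })
    }
}

// `put` takes the trait, not a concrete BinMap or Vec<Bin>.
fn put<T: BinWireWriter>(record: &T) -> Vec<u8> {
    let mut buf = Vec::new();
    record.write_wire(&mut buf);
    buf
}

fn main() {
    let buf = put(&User { id: 7 });
    let back = User::read_wire(&buf).unwrap();
    assert_eq!(back.id, 7);
    println!("ok");
}
```

The key property is that nothing heap-allocated stands between the struct and the buffer apart from the buffer itself.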
> I'm going to improve this one to make it less memory heavy. Did you already start to implement a method for generic input on the value part? Otherwise I would add it here to reduce it to one function again.
Something that just occurred to me: we could use const generics to return an array instead of a Vec. That would be allocated on the stack and theoretically much more efficient. The issue with that is longer compile times and bloated code.
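The const-generics idea could look like this (names are illustrative; the point is that the derive macro knows the field count at compile time, so `to_bins` can return a stack-allocated array):

```rust
// Sketch: to_bins returns a fixed-size array instead of a heap Vec.
#[derive(Debug, PartialEq)]
struct Bin {
    name: &'static str,
    value: i64,
}

trait ToBins<const N: usize> {
    fn to_bins(&self) -> [Bin; N];
}

struct Point {
    x: i64,
    y: i64,
}

// Hand-written stand-in for what the derive would generate (N = 2 fields).
impl ToBins<2> for Point {
    fn to_bins(&self) -> [Bin; 2] {
        [
            Bin { name: "x", value: self.x },
            Bin { name: "y", value: self.y },
        ]
    }
}

fn main() {
    let bins: [Bin; 2] = Point { x: 1, y: 2 }.to_bins();
    assert_eq!(bins[0], Bin { name: "x", value: 1 });
    assert_eq!(bins[1], Bin { name: "y", value: 2 });
    println!("ok");
}
```

The trade-off mentioned above is real: each distinct `N` monomorphizes the generic code, which is where the longer compile times and code bloat come from.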
The `BinWireWriter` idea sounds good to me. So the API functions would accept `BinWireWriter` and `BinWireReader` traits as parameters instead of the usual `BinMap`s and `Value`s, and we just implement the traits for the `Value` and `BinMap` structs/enums?

Sounds possible, but I'm not sure if that's in scope of this PR. Maybe it's worth a second PR to first introduce the traits and afterwards patch this one. If you want, I can start to work on the traits first.
> The `BinWireWriter` idea sounds good to me. So the API functions would accept `BinWireWriter` and `BinWireReader` traits as parameters instead of the usual `BinMap`s and `Value`s, and we just implement the traits for the `Value` and `BinMap` structs/enums?
Exactly, although I guess we probably have to come up with better names.
> Sounds possible, but I'm not sure if that's in scope of this PR. Maybe it's worth a second PR to first introduce the traits and afterwards patch this one. If you want, I can start to work on the traits first.
I agree. I was just bouncing ideas off you and the rest of the community, not particularly asking for changes in this PR. With this new scheme, we could even alias the struct fields to custom bin names, and add a lot more capabilities.
Yes, I think we could use the serde crate and its derive features as a reference. I agree the names are a bit too technical. Most users probably can't do much with the "Wire" part if they never worked on the client or server. I would suggest:

Any other suggestions?
How should we handle structs that can't be parsed by the derive macro or need custom logic? I think the trait function will need the buffer as a parameter, so in theory users could just encode it manually. But I'm not sure whether that's too complicated.
Yes, it needs the buffer, and that's the best part. For writes, we could provide a `write_bin(&Bin)` method on the buffer which would take care of the rest. For reads, we need a `read_bins(fields)` method that iterates over the number of fields and then calls a `read_bin() -> Bin` method that returns the next bin so that the struct can set it. The only issue being that I converted the `Bin.name` type to `String`, which probably allocates on the heap for the `String`.
I looked around yesterday and found this library very useful in understanding how this could be implemented:
I'm thinking about adding a second abstraction layer over the buffer that could provide "easy" functions for wire writing instead. If we just add the `write_bin` function to the buffer, the user is still confronted with all the parsing and writing logic in the buffer implementation. I see a big potential for mistakes if users reach for the "lower level" functions instead of the right ones and encode invalid data there. But that would introduce overhead again.
I don't see how the user would have access to any of the lower level API on the buffer. Aren't they all private? The only public API would be those required to read and write `Bin`s. What am I missing?
No, a lot of them are public. The encoder for operations works that way. https://github.com/aerospike/aerospike-client-rust/blob/master/src/commands/buffer.rs#L1366 https://github.com/aerospike/aerospike-client-rust/blob/master/src/operations/mod.rs#LL124C30-L124C30 If we give the user the ability to directly interact with the buffer instance, they can also access them.
Wow, what a noob mistake on my part. We need to mark them `pub(crate)` and call it a day, I guess. I'll go through the API visibility in the library at some point and make sure these mistakes are addressed before release.
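For anyone following along, the effect of the `pub(crate)` fix looks like this (module and method names here are illustrative, not the real buffer.rs):

```rust
// pub(crate) keeps low-level buffer methods callable from inside the
// crate (e.g. by the command encoders) but invisible to library users.
mod buffer {
    pub struct Buffer {
        bytes: Vec<u8>,
    }

    impl Buffer {
        pub fn new() -> Self {
            Buffer { bytes: Vec::new() }
        }

        // Crate-visible only: not part of the public API surface.
        pub(crate) fn write_u8(&mut self, b: u8) {
            self.bytes.push(b);
        }

        pub fn len(&self) -> usize {
            self.bytes.len()
        }
    }
}

fn main() {
    // Same crate, so the pub(crate) method is accessible here; a
    // downstream crate calling write_u8 would get a privacy error.
    let mut buf = buffer::Buffer::new();
    buf.write_u8(0xAB);
    assert_eq!(buf.len(), 1);
    println!("ok");
}
```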
Now that I think of it, we probably do not need to pass `Bin`s to these newly discussed APIs. Simple tuples would do.
I tested a bit how this could be done. Maybe you want to give some thoughts on it before I continue to implement it everywhere. In this version, I completely work around the original encoders and do the encoding entirely in the trait implementations. The focus was primarily on deprecating Bins and Values and working only in this style. But to keep the API more or less stable, it's easily possible to also implement the traits on Bins and Values, like I did in traits.rs.
Currently this is only implemented for WriteCommand. There is a unit test that passes for it, but most other tests are probably broken now.
Here you see the traits for Bins and Values as writers. At first I wanted to make it one trait, but that gives a lot of potential for passing wrong arguments to some functions. https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-core/src/traits.rs The biggest disadvantage, in my opinion, is having two writer functions on values. But I didn't find a way to solve this without having a Value step in between yet, since Values are packed differently when they are in a CDT context...
Client is modified to directly accept the Trait instances: https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-core/src/client.rs#L268
This is the derive part that builds the functions. Nearly all allocations are resolved at compile time now. https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-macro/src/traits/writable.rs
In the end, it can be used like this: https://github.com/jonas32/aerospike-client-rust/blob/async-derive/tests/src/derive.rs
The code is not beautiful yet and not really optimized. It's just a "working PoC" version.
> I tested a bit how this could be done. Maybe you want to give some thoughts on it before I continue to implement it everywhere. In this version, I completely work around the original encoders and do the encoding entirely in the trait implementations. The focus was primarily on deprecating Bins and Values and working only in this style. But to keep the API more or less stable, it's easily possible to also implement the traits on Bins and Values, like I did in traits.rs.
Thank you @jonas32 , this is more or less what I had in mind.
> The biggest disadvantage, in my opinion, is having two writer functions on values. But I didn't find a way to solve this without having a Value step in between yet, since Values are packed differently when they are in a CDT context...
I don't think there's a clean way around this, but you could have one writer with a parameter selecting one or the other, if you wanted. I prefer the current distinction myself.
> Client is modified to directly accept the Trait instances: https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-core/src/client.rs#L268
Great. I'd like to avoid having a different set of APIs as much as we can.
> This is the derive part that builds the functions. Nearly all allocations are resolved at compile time now. https://github.com/jonas32/aerospike-client-rust/blob/async-derive/aerospike-macro/src/traits/writable.rs
> In the end, it can be used like this: https://github.com/jonas32/aerospike-client-rust/blob/async-derive/tests/src/derive.rs
> The code is not beautiful yet and not really optimized. It's just a "working PoC" version.
I believe this is a step in the right direction and we are closer to the final design now. We have to come up with better names for the macros and traits though.
There are a few more ideas I have that we may want to pursue. Off the top of my head:

- Having modifier macros on the fields themselves
- We may want to allow easy integration with data validation libraries
- Out of the box `std::Option<T>` support?
- Flattening the structs

ps. Just a little nit: I changed the `[Bin; X]` definitions into a single const generic in the async branch.
> Having modifier macros on the fields themselves

Yes, definitely. This will be needed.

Edit: Check out the links again. Working now.
> We may want to allow easy integration with data validation libraries

I'm not sure if that's in scope for the normal lib. I feel like this is rather the job of a webserver etc. instead of the DB layer. But we could think about adding it behind a feature flag, maybe?
> Out of the box `std::Option` support?

Yes, that will come too. I already worked on it a little. I think I'll introduce another function to check if a bin should be parsed at all. That would make Result or Option easy to build.
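One way that "should this bin be parsed at all" hook could shake out (a toy sketch; all names are assumed, and values are modeled as plain `i64` instead of the wire buffer):

```rust
// A missing bin fails required fields but maps to None for Option fields.
trait ReadBin: Sized {
    fn from_bin(raw: Option<i64>) -> Result<Self, String>;
}

impl ReadBin for i64 {
    fn from_bin(raw: Option<i64>) -> Result<Self, String> {
        raw.ok_or_else(|| "missing required bin".to_string())
    }
}

impl<T: ReadBin> ReadBin for Option<T> {
    fn from_bin(raw: Option<i64>) -> Result<Self, String> {
        match raw {
            None => Ok(None), // absent bin is fine for an Option field
            some => T::from_bin(some).map(Some),
        }
    }
}

fn main() {
    assert_eq!(i64::from_bin(Some(5)), Ok(5));
    assert!(i64::from_bin(None).is_err());
    assert_eq!(<Option<i64>>::from_bin(None), Ok(None));
    assert_eq!(<Option<i64>>::from_bin(Some(5)), Ok(Some(5)));
    println!("ok");
}
```

The blanket `Option<T>` impl is what makes "optional" compose: any readable type becomes optionally readable for free.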
> Flattening the structs

Maybe as an additional field modifier? I'm not a big fan of that, since I think parsing structs from and to maps is a huge use case. For me it's probably the primary one, since building nested structures manually on Values is not very friendly right now.
@jonas32 can `WritableBins` (or whatever it gets renamed to) include a compile time check that the bin name is less than the 15 byte maximum? I haven't had the chance to try it out yet and wasn't able to find a check like that here (although I just checked around the `panic!`s)
Great idea, didn't think of that yet. Will come!
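One way such a check could work (a sketch, not the PR's implementation): the derive macro emits a const evaluation per bin name, so an over-long name fails the build instead of panicking at runtime. The helper name is hypothetical; the 15-byte limit is the one discussed above.

```rust
// Const fn panics are compile errors when evaluated in a const context
// (stable since Rust 1.57), which is exactly what we want here.
const fn check_bin_name(name: &str) -> &str {
    assert!(name.len() <= 15, "bin name exceeds the 15 byte maximum");
    name
}

// What generated code could look like; a name longer than 15 bytes here
// would be rejected at compile time, not at runtime.
const USER_ID_BIN: &str = check_bin_name("user_id");

fn main() {
    assert_eq!(USER_ID_BIN, "user_id");
    println!("ok");
}
```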
I just found another problem while working on the read macro. Stream read is processed differently from normal read. Normal read initially reads the full buffer and then works on it. Stream read does not do that: it reads the buffer piece by piece as the info is needed, while discarding all the previous data. If I want to do it without copying parts of the buffer and allocating twice, I need to keep the full buffer until the record is parsed. After that, it can be deleted. @khaf do you see any potential problems with that change?

Edit: Alternatively, the normal read could also be changed to read piece by piece.
As far as I can see, the required full size information should be included in this part of the buffer: https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/stream_command.rs#L67
Just to clarify: Do you mean that you keep the buffer engaged until the record is dropped to avoid memory allocation at all cost?
No, the record is not getting dropped. I just need the buffer until the Record is fully parsed. But I already came to the conclusion that it's better the other way around: changing the normal read command to smaller buffers. Now the smaller buffers with just the value data inside get cloned, since they are allocated anyway and the original buffer will discard them right after the clone. That should result in nearly the same memory usage and makes it a lot easier.
@jonas32 I see, you want to move to a streaming paradigm to reduce the memory usage. That's fantastic, but with caveats:

- The buffer size needs to be chosen carefully (something like `max(required, min(buf size, message size))`). If Rust supports increasing the buffer size of the socket, you could go that way and just let it buffer for you internally and save yourself some headache, but I have not checked the docs myself.

I would love to see the streaming API implemented; that would allow us to have constant sized, small buffers and get rid of one of the most annoying parts of the client tuning. Just as great would be to support this in sending commands as well, circumventing the buffer allocation, sizing, resizing, pooling and deallocation. My initial design of the client was to support this, that's why the buffer struct is there, but we deviated from that idea at some point.
ps. Calling `socket.read` in an async system is even worse than a syscall: you'll yield from the coroutine to the runtime scheduler, with all the consequences.
Per this post, the first issue is easily fixable: https://stackoverflow.com/questions/19076719/how-do-i-convert-a-vector-of-bytes-u8-to-a-string
I'm still fully relying on the Buffer and encoder/decoder methods that are in place, so this should already solve it: https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/buffer.rs#L1288 I already tried it that way and it seems to be fine.
I also thought about the consequences of reading so often in small pieces. I'll test some things and check if we can improve the performance there.
> I'm still fully relying on the Buffer and encoder/decoder methods that are in place, so this should already solve it: https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/buffer.rs#L1288 I already tried it that way and it seems to be fine.
This still requires an intermediary buffer to read into. My solution would allow reading directly into the string builder's buffer, and would remove the need to have a big intermediary buffer completely. All other data types would require mere bytes of buffer to read, and would work with buffers allocated on stack. Anyway, I still do not understand what you meant, since in the streaming implementation we also read chunks into the buffer and decode them off of it.
> I also thought about the consequences of reading so often in small pieces. I'll test some things and check if we can improve the performance there.
I think you should first take the naive approach. Improve after you hit performance issues.
Ah ok, now I get what you mean with the buffer strings. I think it's a question of usability vs. memory again there. If we pre-parse strings at that point, the trait implementations can't all work on the Buffer, but rather on enums. At that point, we basically start to reintroduce value.rs. Additionally, it would no longer work if a user wants to do custom parsing of string data into their own datatype.
I think we are talking past each other due to my inability to internalize the problem you are facing. When you have your initial POC (even if it does not work), I'll take a look and then we can talk about the specifics. Does that sound good?
ReadableBins is now working too. It supports the basic stuff like ints and strings, lists, nested lists, options and field annotations (only rename) by now.
While doing that, I ran into three questions.
My full list is this currently (Read/Write):

* i8 - i32 (W)
* i64/isize (R/W)
* u8 - u32 (W)
* u64/usize (R/W)
* f32 (W)
* f64 (R/W)
* &str (W)
* String (R/W)
* bool (R/W)
* Option (R/W)
* Vec (R/W)
* HashMap<T, T> (R/W)
* aerospike::Value (R/W)

I'll probably also have a look at Bitwise/HLL later on, but I'm not sure yet if I want to introduce custom types or annotations.
> While doing that, I ran into three questions.
> * First, from what I see, the limit for the buffer is at `reclaim_threshold`, around 65 KB. How are larger blobs handled? The record limit can be up to 8 MB, so that would not fit in one pull. Was that what you were talking about originally?
The way that works is that it allocates the data for one command, and then trims back the buffer and releases the excess memory. Not the most efficient, but it works if tweaked well by setting `reclaim_threshold` to a P95 record size.
> * Second, https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/msgpack/decoder.rs#L59 What is the purpose of this kind of ext data? I have never seen it in any buffer.
That's because of our less-than-standard use of Msgpack. Don't mind it, it is what it is.
> * I'm planning to implement Readable/Writable for most everyday data types. How far should we extend that? From what I see in the old packers, Aerospike only cares about i64 and f64; everything else is cast. Should we cast back down to smaller types for reading too? I see a risk of wrong data being given to the user with that. It happened to me during testing with f32. Do we also want to implement it for some of the std types?
I believe it was and still is a bad idea to automatically cast/convert data, outside of integers. I'm not sure why f32 was even supported. I'd rather remove it. I believe we should not venture outside of what we already have with Values, and for custom converters, it should happen in a rigid and lossless manner.
> My full list is this currently (Read/Write):
> * i8 - i32 (W)

Can be cast down at runtime; it's lossless. For the ones that don't match the size, a parsing error should be returned.
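The lossless-downcast rule sketched in code (the function name is illustrative): the wire carries `i64`, and reading into a smaller integer succeeds only when the value fits, otherwise a parse error is returned instead of silently truncating.

```rust
// try_from fails on overflow, unlike the `as` cast, which would truncate.
fn read_as_i32(wire: i64) -> Result<i32, String> {
    i32::try_from(wire).map_err(|_| format!("{} does not fit in i32", wire))
}

fn main() {
    assert_eq!(read_as_i32(1000), Ok(1000));
    assert!(read_as_i32(i64::MAX).is_err()); // `as` would have truncated
    println!("ok");
}
```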
> * i64/isize (R/W)
> * u8 - u32 (W)

Same as above.
> * u64/usize (R/W)
> * f32 (W)

We should probably remove this completely. Let the users cast up/down at their own risk.
> * f64 (R/W)
> * &str (W)

Why no read?
> * String (R/W)
> * bool (R/W)
> * Option (R/W)
> * Vec (R/W)
> * HashMap<T, T> (R/W)
> * aerospike::Value (R/W)

Is this for GeoJSON?
Now that you have made the change, where do I have to look to understand the point you were making before regarding non-streaming buffers?
Ok, limiting the data types sounds good to me. I want to keep supporting Value, since it's not that easy to build every data layout. Since Aerospike does not care about data types inside of CDT contexts, it could become hard to build lists/maps otherwise. It also serves as a compatibility layer so we don't completely break everything.
About the buffers: My first idea was to edit https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/stream_command.rs#L83 to make it look like https://github.com/aerospike/aerospike-client-rust/blob/async/aerospike-core/src/commands/read_command.rs#L143, where the whole response is read before touching it. But I figured out myself that this might not be the best idea, since this response can grow big. So I did exactly the opposite and changed the read command to handle the buffer exactly like the stream command. It now reads the buffer operation by operation. These operations are then cloned out of the buffer into a smaller one that only holds this one operation (just the particle type and the blob itself). This is then appended to the PreParsedValue struct and given to the trait implementations for further parsing. This way, the trait implementations can still use the buffer functions for reading the individual value types, just as the Value struct did before.
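The per-operation read described above can be sketched like this (a toy length-prefixed framing for illustration, not the real Aerospike wire protocol):

```rust
// Instead of buffering the whole response, each operation's payload is
// read into its own small buffer, which trait implementations can then
// parse independently while the rest of the stream is still in flight.
use std::io::Read;

fn read_operations(mut stream: impl Read) -> Vec<Vec<u8>> {
    let mut ops = Vec::new();
    loop {
        // Read just the 4-byte length header of the next operation.
        let mut len_buf = [0u8; 4];
        if stream.read_exact(&mut len_buf).is_err() {
            break; // end of stream: no more operations
        }
        let len = u32::from_be_bytes(len_buf) as usize;

        // Small per-operation buffer; the full response never lives in
        // memory as a whole.
        let mut payload = vec![0u8; len];
        stream.read_exact(&mut payload).expect("truncated operation");
        ops.push(payload);
    }
    ops
}

fn main() {
    // Two framed operations: [len=2]"hi" and [len=1]"x".
    let mut wire = Vec::new();
    wire.extend_from_slice(&2u32.to_be_bytes());
    wire.extend_from_slice(b"hi");
    wire.extend_from_slice(&1u32.to_be_bytes());
    wire.extend_from_slice(b"x");

    let ops = read_operations(&wire[..]);
    assert_eq!(ops, vec![b"hi".to_vec(), b"x".to_vec()]);
    println!("ok");
}
```

As noted earlier in the thread, each small read is a trip through the async scheduler, so batching the header and payload reads may matter for performance.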
This PR adds the `ToBins` and `ToValue` derive macros. Manually parsing large structs into Aerospike Bins or Values is a lot of boilerplate work currently.

The macros can simply be attached to any struct. They will add a `to_bins` or `to_value` function. The `to_bins` function returns a `Vec<Bin>` that can be passed to write functions like the `client::put` function. The `to_value` function builds a HashMap Value for the struct. ToBins is based on ToValue, so sub-structs are also parsed. The ToValue trait has implementations for the common data types, including variations with HashMaps or Vecs (if I didn't forget any, otherwise please tell me).

I also added a `client::put_struct` method to make it more convenient.

Caution: this branch is based on #108 (Async Client)
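A hand-written sketch of roughly what `#[derive(ToBins)]` expands to for a small struct (the `Value`/`Bin` types here are simplified stand-ins for the crate's real ones):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    String(String),
    HashMap(HashMap<String, Value>),
}

#[derive(Debug, PartialEq)]
struct Bin {
    name: String,
    value: Value,
}

trait ToValue {
    fn to_value(&self) -> Value;
}

trait ToBins {
    fn to_bins(&self) -> Vec<Bin>;
}

struct User {
    id: i64,
    name: String,
}

// to_value builds a HashMap Value, so the struct can also be nested
// inside another struct as a sub-value.
impl ToValue for User {
    fn to_value(&self) -> Value {
        let mut map = HashMap::new();
        map.insert("id".to_string(), Value::Int(self.id));
        map.insert("name".to_string(), Value::String(self.name.clone()));
        Value::HashMap(map)
    }
}

// to_bins returns a Vec<Bin> suitable for write functions like client::put.
impl ToBins for User {
    fn to_bins(&self) -> Vec<Bin> {
        vec![
            Bin { name: "id".to_string(), value: Value::Int(self.id) },
            Bin { name: "name".to_string(), value: Value::String(self.name.clone()) },
        ]
    }
}

fn main() {
    let user = User { id: 1, name: "a".to_string() };
    let bins = user.to_bins();
    assert_eq!(bins.len(), 2);
    assert_eq!(bins[0].name, "id");
    println!("ok");
}
```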