Closed: Barinzaya closed this pull request 1 year ago.
Sure thing! Let me know if you have any comments or suggestions.
@Barinzaya That's awesome work you did there 🐎 !
I'll try to find some time in the next few days, maybe next week, to take a deeper look into the changes and give it a proper review. Thanks in advance!
Edit: I deleted the original comment, because I accidentally made that using my work account.
I simplified the `encode_args` benchmarks so that the packets being encoded are simpler and the focus is more on the arguments (it was `bundle(1 message(args))`, now just `message(args)`). This improved the performance on the branch further (expected, since there are fewer encoding steps), but for some reason it didn't seem to have any effect on the benchmark time on `master`, much to my surprise. This makes the improvement numbers look nicer, but also makes me wonder if the benchmarks I added are missing some case that accounts for that difference.
Just wanted to make an open note of it, since it gives me the feeling of fluffing the numbers, which wasn't the intention at all.
@Barinzaya Can you share the command (if there is one) that you used to create the benchmark table from https://github.com/klingtnet/rosc/pull/44#issue-1617160005?
Did you use https://github.com/BurntSushi/cargo-benchcmp ?
Update
I tried cargo-benchcmp; what I did was

$ git rebase --exec 'cargo bench | tee bench-$(git rev-parse HEAD)' origin/master

to run a benchmark for each commit. Afterwards you can do something like

$ cargo benchcmp bench-3146ce4cf539bf88e638261ee1d6bd9c104420aa bench-fabeccd7d73c8416706804007acb0e0d89c6f673 --regressions

to check for regressions (note the `bench-` prefix, since that's what the `tee` command above names the output files). However, the benchmark could not be run for all of the commits, so I think we might need to squash some of them. Also, I couldn't reproduce speedups similar to yours yet.
As for the benchmark comparison table, I created it manually. I ran `cargo bench` on `master` and copied the output, then switched to `encoder-improvements`, ran it again, and copied its output. I made a copy of the `encoder_bench` file and removed the `encode_into` tests for use on `master`. The relative speeds were also computed manually (old time / new time).
I probably could have at least used a spreadsheet or something, but there were only a few benchmarks when I started and I never bothered to set one up. There might be other, more suitable tools out there too.
Commits have been squashed.
Unsquashed commits are still available in the `encoder-improvements+unsquashed` branch on the fork's repository.
Looks good to me. I'm going to merge this in the next few days and then prepare a release. @Barinzaya Good work 👍🏻
Added one more commit, which adds an encoder unit test using `encode_into` to write to a `Vec` via a `Cursor`, using the `WriteOutput` struct. I feel this is somewhat important, as the `Output` implementation for `WriteOutput` is not covered by the existing unit tests.
That makes this 5 commits, one per file touched. I can squash them down a bit further (e.g. combine the commit for the unit test change with the commit for the encoder changes) if preferred.
This PR reworks the implementation of the `encoder` module to accomplish two goals:

1. Improve the performance of the `encoder` logic. The existing implementation performs a lot of small allocations because it creates and passes around small `Vec` and `String` objects internally. The reworked implementation passes a reference to the output through the call chain, and all data is written directly to that output; no internal buffers are allocated. The performance improvement from this is significant, improving performance by 3-50x in my results from stress-test benchmarks that I created alongside these changes (sample results below). In a practical project in which I'm using `rosc`, these changes have cut the time spent encoding OSC packets to about 1/8 of what it was before (from ~150 us per packet to ~18 us) with no code changes in that project.

2. Make the output more flexible. This PR adds an `Output` trait which gives the user control over what is done with the output. `Output` is implemented for `Vec<u8>`, allowing a packet to easily be encoded into a `Vec`. For `std`, there is also a `WriteOutput` newtype which allows any `std::io::Seek + std::io::Write` type to be used as an `Output`, so OSC data can be encoded directly to a `Cursor`, `File`, etc. These changes also include `encode_into` and `encode_string_into` functions, which let the user pass a mutable reference to any object implementing the `Output` trait and thereby control how the encoded data is handled.

The existing public functions have not had their signatures altered; only new public items have been added. Thus, according to the SemVer Compatibility guide, these changes should only qualify as a minor change. Additionally, these changes do not incur an MSRV bump (1.52 is the minimum supported version according to `cargo-msrv`, both with and without these changes), so they don't require a major version bump.

This also fixes the unit tests not building without the `std` feature. All existing unit tests pass, both with and without `std`. No additional unit tests have been added.

Benchmark results on my machine for comparison:
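To make the design concrete, here is a rough, self-contained sketch of the `Output` approach described above. The names `Output` and `WriteOutput` come from the PR description, but the method names, associated error types, and the `encode_string_into` body are illustrative assumptions, not the crate's actual API:

```rust
use std::io::{Cursor, Seek, Write};

// Illustrative sketch of the Output trait described in the PR.
// The real rosc API may differ in names and signatures.
trait Output {
    type Error;
    /// Append raw bytes to the output.
    fn write_bytes(&mut self, data: &[u8]) -> Result<(), Self::Error>;
}

// Encoding straight into a Vec<u8> needs no intermediate buffers.
impl Output for Vec<u8> {
    type Error = std::convert::Infallible;
    fn write_bytes(&mut self, data: &[u8]) -> Result<(), Self::Error> {
        self.extend_from_slice(data);
        Ok(())
    }
}

// Newtype adapting any std::io::Write + std::io::Seek into an Output.
struct WriteOutput<W>(W);

impl<W: Write + Seek> Output for WriteOutput<W> {
    type Error = std::io::Error;
    fn write_bytes(&mut self, data: &[u8]) -> Result<(), Self::Error> {
        self.0.write_all(data)
    }
}

// Hypothetical encode_string_into: writes an OSC-style padded string
// directly to the output instead of building an intermediate String.
fn encode_string_into<O: Output>(s: &str, out: &mut O) -> Result<(), O::Error> {
    out.write_bytes(s.as_bytes())?;
    // OSC strings are NUL-terminated and padded to a 4-byte boundary.
    let pad = 4 - (s.len() % 4);
    out.write_bytes(&[0u8; 4][..pad])
}

fn main() {
    // Encode into a plain Vec<u8>...
    let mut vec_out: Vec<u8> = Vec::new();
    encode_string_into("/osc", &mut vec_out).unwrap();
    assert_eq!(vec_out, b"/osc\0\0\0\0".to_vec());

    // ...or into any Write + Seek type via the WriteOutput newtype.
    let mut cur_out = WriteOutput(Cursor::new(Vec::new()));
    encode_string_into("/osc", &mut cur_out).unwrap();
    assert_eq!(cur_out.0.into_inner(), vec_out);
}
```

The key property is that both call sites share one write path, which is why the rework can avoid the per-call `Vec`/`String` allocations that the old implementation paid for.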
(Benchmark tables, collapsed in the original thread: `master` (Windows), `encoder-improvements` (Windows), `master` (Linux), `encoder-improvements` (Linux).)

I find the discrepancy between the `encode` benchmarks and the `into_new_vec` benchmarks interesting, because they should be doing almost exactly the same thing, and yet the `encode` benchmark is noticeably faster in most cases (except for the `huge` benchmark).