bczhc / bzip3-rs

Rust wrapper for bzip3 compression library
GNU Lesser General Public License v3.0
1 stars 1 forks source link

Add test to check if chained encoders finishes #4

Closed marcospb19 closed 1 year ago

marcospb19 commented 1 year ago

Hey, I do believe this test should pass, but it hangs indefinitely.

The test chains 5 Encoders and writes on the top one, the result should be random data in the format .bz3.bz3.bz3.bz3.bz3, but it runs forever.

If you edit the chain number to 1, it succeeds immediately, try running:

cargo test --release -- chain

This is the smaller test I saw failing while I was doing a bigger one with encoder and decoders.

Idk, it's technically possible that bzip3 makes assumptions about buffers and I can't chain them like this.

marcospb19 commented 1 year ago

Related to #1, in Ouch it's possible to chain any number of formats when compressing or decompressing a file/archive.

bczhc commented 1 year ago

Good catch! This is a bug! Thanks for finding it. I'm on the way fixing them... For the tests, I hope {read, write}::{Bz3Encoder, Bz3Decoder} with a variety of block_size can all be covered. Now your 5 chained encoder test can pass. I'm working to fix and test another three cases.

bczhc commented 1 year ago

e2ed1991d55e7eb4f46063b3ab5207bf1901cdc9. all chained cases seem to be OK. I use an eof flag to solve this, just not coming up with a more elegant approach...

marcospb19 commented 1 year ago

For reference, here is the original test I wrote, it chains write decoders with read encoders.

Do you want to commit the tests yourself? Otherwise I can edit this PR and try merging one of the two tests I've shown you, Whatever you prefer, let me know :pray:.

(Your responses are lightning quick :zap::zap::zap:, no need to rush though!)

marcospb19 commented 1 year ago

Oh, note that I set the random-data length to be bigger than the block size, that's to force Bz3Encoders to make multiple reads, if length was less than block size, I believe a single read for each item in the chain would be enough, but I wanted to test the "interleaving" calls of Read between the encoders, going back and forth.

I used:

block size = 70kb
random data length = 200kb

One thing I just realized is that only the first one might be forced to write it all, because compressed data is shrunk down.

200KB -> ReadEncoder1 (reads 200KB) -> ReadEncoder2 (reads X) -> ReadEncoder3 (reads Y)

If X and Y are below 70kb, then it's a single-block read, so we might have to do some tests and raise the random data length.

Maybe having two different tests... one with smaller input for single-block processing, and one with bigger input for multi-block processing.

bczhc commented 1 year ago

Feel free to write the tests as your intention and needs, finish the PR thanks!

---- Replied Message ---- | From | João @.> | | Date | 06/12/2023 22:52 | | To | @.> | | Cc | Zhai @.>@.> | | Subject | Re: [bczhc/bzip3-rs] Add test to check if chained encoders finishes (PR #4) |

For reference, here is the original test I wrote, it chains write decoders with read encoders.

Do you want to commit the tests yourself? Otherwise I can edit this PR and try merging one of the two tests I've shown you, Whatever you prefer, let me know 🙏.

(Your responses are lightning quick ⚡⚡⚡, no need to rush though!)

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>

marcospb19 commented 1 year ago

Hey, I'm back from a long gaming break :nerd_face::video_game: , can you re-review?

Test passes since your last fix.

marcospb19 commented 1 year ago

Oops forgot to rebase.

bczhc commented 1 year ago

Oh great! LGTM, thanks!

marcospb19 commented 1 year ago

With this merged I think I can go back to the Ouch side, I was getting a bug with chained formats and I think that was the cause.

I'll keep you updated on my progress in the other issue.