wcampbell0x2a / backhand

Library and binaries for the reading, creating, and modification of SquashFS file systems
Apache License 2.0

How to show progress while creating a squashfs image? #273

Closed hwittenborn closed 1 year ago

hwittenborn commented 1 year ago

I'm trying to show progress the way mksquashfs does. mksquashfs updates its progress as bytes are written, but I'm unsure how I'd do that with this crate.

hwittenborn commented 1 year ago

I guess what I'm looking for is a way to figure out what the resulting image's size will be before calling FilesystemWriter::write. I'm not sure how mksquashfs does it, but is there any way to calculate that ahead of time if that's what's needed?

wcampbell0x2a commented 1 year ago

I pushed https://github.com/wcampbell0x2a/backhand/pull/274, which should (I haven't tested this enough) give the number of uncompressed bytes that are compressed into the image.

I think squashfs-tools/unsquashfs just reports the number of uncompressed bytes that it read, as it's almost impossible (?) to know the compressed size before you compress the bytes. It then increments the progress bar as it compresses those bytes.

Time willing, I want to create my own mksquashfs: https://github.com/wcampbell0x2a/backhand/issues/21

As for updating the progress bar as you write bytes, you will just need your own wrapper around Write that increments the progress bar each time its write() is called.
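A minimal sketch of such a wrapper, using only the standard library. `CountingWriter` is a hypothetical name, not something backhand provides, and a real progress bar update would replace the plain counter. It assumes the destination only needs `Write`; if the library also requires `Seek` on its output, the wrapper would have to forward that trait as well.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper (not part of backhand): forwards writes to `inner`
/// and keeps a running total of the bytes the inner writer accepted.
struct CountingWriter<W: Write> {
    inner: W,
    written: u64,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, written: 0 }
    }

    /// Total number of bytes accepted by the inner writer so far.
    fn written(&self) -> u64 {
        self.written
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Count only what was actually written, which may be less than buf.len().
        let n = self.inner.write(buf)?;
        self.written += n as u64;
        // A progress bar would be advanced by `n` here.
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

fn main() -> io::Result<()> {
    let mut out = CountingWriter::new(Vec::new());
    out.write_all(b"hello ")?;
    out.write_all(b"world")?;
    println!("{} bytes written", out.written());
    Ok(())
}
```

The counter reflects whatever bytes are handed to the writer, so it measures output as it lands in the destination rather than input consumed.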

hwittenborn commented 1 year ago

I'm a bit confused on how I'd use that to get a progress bar going @wcampbell0x2a. Sorry if I'm overlooking something, but I'd need to know how many uncompressed bytes have already been written in order to do what I'm needing, right? I've tried wrapping Write, but getting buf.len() from Write::write appears to be showing the number of compressed bytes that are about to be written out, not the number of uncompressed.

You mentioned that FilesystemWriter::uncompressed_size would list the size of uncompressed files, but that doesn't include (among other things) the bytes to store the compression used, right?

The way I'm thinking it'd be done is that FilesystemWriter::write could accept a closure (maybe under a separate FilesystemWriter::write_callback if that'd be better), which gets called every time the Write trait object is called internally. The closure could be passed how many uncompressed bytes have been written so far and the total number of uncompressed bytes that will be in the SquashFS image (including things like the bytes for compression info).

Let me know if I'm just overlooking something too, I was just throwing out some ideas from what I saw.

hwittenborn commented 1 year ago

Honestly I don't even need to know the size of compression info, I think it's a good enough metric to know the total size of all files and the size of all files that have been written. I'm planning on just providing a percentage of progress done anyway, so it doesn't really matter to me.

I'm looking at the output of mksquashfs and it doesn't look like they use anything specific for the numbers in the progress bars anyway, so leaving out miscellaneous squashfs metadata is more than fine by me.

wcampbell0x2a commented 1 year ago

I stepped away from my computer again, so I haven't really read your reply thoroughly.

Instead of wrapping the Write, I think you could wrap the Read instead. Read at the point of being used by my DataWriter should be only the uncompressed bytes.

hwittenborn commented 1 year ago

I didn't think about that, I could definitely just see as bytes are read and then total it up like that.

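I've just got two questions though:

1. Is there any guarantee that Read calls will only happen when FilesystemWriter::write is called?
2. Is there any guarantee that a call to the Read object won't try to read the file more than once?

I'm assuming that's how it already works, but I think it'd be good to document it, since otherwise it's easy not to assume it and then not know how to implement any kind of progress reporting.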

hwittenborn commented 1 year ago

Can confirm my custom Read wrapper works as intended, I just read file sizes when opening up the files and then have a separate Arc<Mutex<u64>> that I track the number of bytes read in.
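The approach described above can be sketched roughly as follows, using only the standard library. `ProgressReader` and `percent_done` are hypothetical names, and in-memory `Cursor`s stand in for opened files, whose sizes would normally come from file metadata. How the wrapped readers are handed to backhand is left out.

```rust
use std::io::{self, Cursor, Read};
use std::sync::{Arc, Mutex};

/// Hypothetical wrapper (not part of backhand): adds every byte read from
/// `inner` to a counter shared by all wrapped files.
struct ProgressReader<R: Read> {
    inner: R,
    read_so_far: Arc<Mutex<u64>>,
}

impl<R: Read> Read for ProgressReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        *self.read_so_far.lock().unwrap() += n as u64;
        Ok(n)
    }
}

/// Percentage of `total` covered by `read`; an empty input counts as done.
fn percent_done(read: u64, total: u64) -> f64 {
    if total == 0 {
        100.0
    } else {
        read as f64 * 100.0 / total as f64
    }
}

fn main() -> io::Result<()> {
    // Stand-ins for two files of 300 and 700 bytes.
    let files: Vec<Vec<u8>> = vec![vec![0u8; 300], vec![1u8; 700]];
    // Total uncompressed size, gathered up front when the files are opened.
    let total: u64 = files.iter().map(|f| f.len() as u64).sum();
    let counter = Arc::new(Mutex::new(0u64));

    for data in files {
        let mut reader = ProgressReader {
            inner: Cursor::new(data),
            read_so_far: Arc::clone(&counter),
        };
        // Draining the reader stands in for the library consuming the file.
        let mut sink = Vec::new();
        reader.read_to_end(&mut sink)?;
        let done = *counter.lock().unwrap();
        println!("{:.0}% ({}/{} bytes)", percent_done(done, total), done, total);
    }
    Ok(())
}
```

The percentage is only meaningful if each file is read exactly once and only during the write, which is what the two open questions are about.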

This issue's effectively fixed for me. As long as those two questions I had can be confirmed, it'll be good to get closed :)

wcampbell0x2a commented 1 year ago

> Is there any guarantee that Read calls will only happen when FilesystemWriter::write is called?

I think once FilesystemWriter is created all Read calls will be correctly counted. But if you are dealing with FilesystemReader and Squashfs they will be using that same reader.

> Is there any guarantee that a call to the Read object won't try to read the file more than once?

Yes. It's allowed to Seek around, but items should only be read once, and reading more than that would be a bug if found. Having said that, I don't have a mksquashfs so I don't know for certain that this holds!

wcampbell0x2a commented 1 year ago

Interestingly, this progress bar library has a wrap_read already: https://docs.rs/indicatif/latest/indicatif/struct.ProgressBar.html#method.wrap_read

I'll close this unless you have more questions.

hwittenborn commented 1 year ago

Oh cool, I didn't realize that library had that. My use case actually uses indicatif, so I'll look into porting my code over to that. I'm all good on my end though, thanks a ton for helping me get this resolved!