dyz1990 / sevenz-rust

A 7z decompressor/compressor lib written in pure rust
Apache License 2.0
146 stars 24 forks

entry's compressed_size is always 0 #49

Closed tkcl81 closed 7 months ago

tkcl81 commented 7 months ago

Hey,

thanks for your great work! One issue I'm facing is that compressed_size is 0 for all entries (this is also the case with the test file in your repo). Is that a known issue?

dyz1990 commented 7 months ago

With solid compression, all files are compressed together as one block, so the compressed size of an individual file cannot be determined accurately. If you really need the compressed size of each file, you can use non-solid compression:

use sevenz_rust::*;
use std::fs::File;
use std::path::Path;

fn main() {
    let mut sz = SevenZWriter::create("path/to/dest.7z").expect("create writer ok");
    let src = Path::new("path/to/source.txt");
    let name = "source.txt".to_string();
    // Non-solid compression: the entry is pushed on its own,
    // so the returned entry carries its own compressed size.
    let entry = sz
        .push_archive_entry(
            SevenZArchiveEntry::from_path(src, name),
            Some(File::open(src).unwrap()),
        )
        .expect("ok");
    let compressed_size = entry.compressed_size;
    println!("compressed size: {compressed_size}");
    sz.finish().expect("done");
}
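
To illustrate why a per-file compressed size only exists in the non-solid case, here is a minimal stdlib-only sketch. The `Block` type, the `entry_compressed_size` helper and the byte counts are hypothetical and are not part of the sevenz-rust API; they only model the idea that a packed block has a single compressed size, which can be attributed to a file only when the block holds exactly one file.

```rust
/// Hypothetical model of one packed block (not a sevenz-rust type).
struct Block {
    /// Compressed size of the whole block in bytes.
    packed_size: u64,
    /// Names of the files stored in this block.
    entry_names: Vec<&'static str>,
}

/// Returns a per-file compressed size only when the block holds a single file.
/// For a solid block with several files there is no accurate per-file value.
fn entry_compressed_size(block: &Block) -> Option<u64> {
    if block.entry_names.len() == 1 {
        Some(block.packed_size)
    } else {
        None
    }
}

fn main() {
    // Illustrative values only.
    let non_solid = Block { packed_size: 500, entry_names: vec!["a.txt"] };
    let solid = Block { packed_size: 900, entry_names: vec!["a.txt", "b.txt"] };

    println!("non-solid: {:?}", entry_compressed_size(&non_solid));
    println!("solid:     {:?}", entry_compressed_size(&solid));
}
```
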

tkcl81 commented 7 months ago

Thanks for the feedback! I see that 7z provides that information for sample.7z. At first I thought it might apply the same ratio to all compressed sizes, but as expected the png compressed much less than the txt, so these figures look quite reliable:

Path = sample.7z
Type = 7z
Physical Size = 2748
Headers Size = 220
Method = LZMA2:23
Solid = -
Blocks = 3

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
1601-01-02 23:28:34 .....           13           17  inner/inner.txt
1601-01-02 23:27:37 .....         1417         1388  7ziplogo.png
1601-01-02 23:27:37 .....         2247         1123  7zFormat.txt
------------------- ----- ------------ ------------  ------------------------
1601-01-02 23:28:34               3677         2528  3 files
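
As a quick check of the per-file figures in the listing above, the following stdlib-only sketch computes each file's compressed size as a percentage of its original size and sums the packed sizes. The helper names `ratio_percent` and `total_packed` are made up for this illustration; the numbers are copied from the listing.

```rust
/// Compressed size as a percentage of the original size.
fn ratio_percent(size: u64, compressed: u64) -> f64 {
    compressed as f64 / size as f64 * 100.0
}

/// Sum of the compressed sizes of all entries.
fn total_packed(entries: &[(&str, u64, u64)]) -> u64 {
    entries.iter().map(|(_, _, compressed)| *compressed).sum()
}

fn main() {
    // (name, size, compressed) copied from the 7z listing of sample.7z.
    let entries = [
        ("inner/inner.txt", 13u64, 17u64),
        ("7ziplogo.png", 1417, 1388),
        ("7zFormat.txt", 2247, 1123),
    ];

    for (name, size, compressed) in entries.iter() {
        println!("{:<16} {:6.1}%", name, ratio_percent(*size, *compressed));
    }

    // The packed total plus the header size (220) gives the physical size (2748).
    let total = total_packed(&entries);
    println!("total packed: {}, with headers: {}", total, total + 220);
}
```

The tiny text file grows slightly, the png barely shrinks, and the larger text file shrinks to about half, which is consistent with each file having been compressed as its own block.
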

tkcl81 commented 7 months ago

I've run more tests and the compressed size is always 0 for both solid and regular archives, so it looks like a bug.

dyz1990 commented 7 months ago

@tkcl81 I added a method push_source_path_non_solid to SevenZWriter and used it to compress the files in the src folder. Here is the result:

Listing archive: src.7z

--
Path = src.7z
Type = 7z
Physical Size = 40077
Headers Size = 937
Method = LZMA2:23
Solid = -
Blocks = 25

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2023-12-13 08:40:54 .....         6763         1715  en_funcs.rs
2023-10-16 14:49:03 .....        10933         2646  aes256sha256.rs
2023-12-05 10:28:52 .....         1043          446  lib.rs
2023-08-19 15:08:40 .....         1587          651  error.rs
2023-10-16 14:10:16 .....         4348         1148  writer/unpack_info.rs
2023-12-13 08:36:31 .....         3196         1006  writer/seq_reader.rs
2023-10-16 14:10:40 .....         1547          638  writer/pack_info.rs
2023-10-16 14:49:28 .....         5038         1152  bcj2/mod.rs
2023-10-16 14:49:17 .....         9688         1798  bcj2/bcj2_decode.rs
2023-06-09 18:33:50 .....         4266         1208  wasm.rs
2023-10-22 18:25:15 .....         5489         1548  decoders.rs
2023-05-24 11:00:04 .....         2530          602  bcj/arm.rs
2022-12-30 11:48:25 .....         1447          538  bcj/sparc.rs
2022-12-30 11:48:25 .....         1237          503  bcj/ppc.rs
2023-10-16 14:51:32 .....         4534         1333  bcj/mod.rs
2023-10-16 14:06:46 .....         3129          894  bcj/x86.rs
2023-10-16 14:42:39 .....         1235          500  delta.rs
2023-12-13 08:42:45 .....        22931         4877  writer.rs
2023-10-16 14:49:45 .....         9264         2237  archive.rs
2023-10-16 14:51:54 .....         4998         1233  de_funcs.rs
2023-10-16 14:40:48 .....         3051          940  folder.rs
2023-06-01 11:35:37 .....         1008          435  password.rs
2023-10-16 14:35:50 .....         1537          485  method_options.rs
2023-10-22 18:25:20 .....         4070         1167  encoders.rs
2023-12-06 16:10:01 .....        54800         9440  reader.rs
------------------- ----- ------------ ------------  ------------------------
2023-12-13 08:42:45             169669        39140  25 files

And the result of using push_source_path (solid mode):

Listing archive: src.7z

--
Path = src.7z
Type = 7z
Physical Size = 28759
Headers Size = 791
Method = LZMA2:23
Solid = +
Blocks = 1

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2023-12-13 08:40:54 .....         6763        27968  en_funcs.rs
2023-10-16 14:49:03 .....        10933               aes256sha256.rs
2023-12-05 10:28:52 .....         1043               lib.rs
2023-08-19 15:08:40 .....         1587               error.rs
2023-10-16 14:10:16 .....         4348               writer/unpack_info.rs
2023-12-13 08:36:31 .....         3196               writer/seq_reader.rs
2023-10-16 14:10:40 .....         1547               writer/pack_info.rs
2023-10-16 14:49:28 .....         5038               bcj2/mod.rs
2023-10-16 14:49:17 .....         9688               bcj2/bcj2_decode.rs
2023-06-09 18:33:50 .....         4266               wasm.rs
2023-10-22 18:25:15 .....         5489               decoders.rs
2023-05-24 11:00:04 .....         2530               bcj/arm.rs
2022-12-30 11:48:25 .....         1447               bcj/sparc.rs
2022-12-30 11:48:25 .....         1237               bcj/ppc.rs
2023-10-16 14:51:32 .....         4534               bcj/mod.rs
2023-10-16 14:06:46 .....         3129               bcj/x86.rs
2023-10-16 14:42:39 .....         1235               delta.rs
2023-12-13 08:42:45 .....        22931               writer.rs
2023-10-16 14:49:45 .....         9264               archive.rs
2023-10-16 14:51:54 .....         4998               de_funcs.rs
2023-10-16 14:40:48 .....         3051               folder.rs
2023-06-01 11:35:37 .....         1008               password.rs
2023-10-16 14:35:50 .....         1537               method_options.rs
2023-10-22 18:25:20 .....         4070               encoders.rs
2023-12-06 16:10:01 .....        54800               reader.rs
------------------- ----- ------------ ------------  ------------------------
2023-12-13 08:42:45             169669        27968  25 files
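
The trade-off between the two listings above can be put into numbers with a short stdlib-only sketch. The helper names `saved_bytes` and `saved_percent` are made up for this illustration; the sizes are copied from the two listings.

```rust
/// Bytes saved by the solid archive relative to the non-solid one.
fn saved_bytes(non_solid: u64, solid: u64) -> u64 {
    non_solid.saturating_sub(solid)
}

/// The same saving as a percentage of the non-solid packed size.
fn saved_percent(non_solid: u64, solid: u64) -> f64 {
    saved_bytes(non_solid, solid) as f64 / non_solid as f64 * 100.0
}

fn main() {
    // Packed and header sizes copied from the two listings of src.7z.
    let (non_solid_packed, non_solid_headers) = (39_140u64, 937u64);
    let (solid_packed, solid_headers) = (27_968u64, 791u64);

    // In both listings, packed size plus header size equals the physical size.
    assert_eq!(non_solid_packed + non_solid_headers, 40_077);
    assert_eq!(solid_packed + solid_headers, 28_759);

    println!(
        "solid mode saves {} bytes ({:.1}% of the non-solid packed size)",
        saved_bytes(non_solid_packed, solid_packed),
        saved_percent(non_solid_packed, solid_packed)
    );
}
```

Solid mode packs the same 25 files into noticeably fewer bytes, but it reports only a single compressed size for the whole block, which 7z prints on the first entry.
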

tkcl81 commented 7 months ago

Thanks. To be clear: the issue I see is that sevenz-rust always returns 0 for the compressed size when reading archives. This happens both with sample.7z (non-solid) from the examples directory and with any file created with the 7z tool.

dyz1990 commented 7 months ago

@tkcl81 Sorry, I misunderstood the problem before. I have now fixed this issue in v0.5.4.

tkcl81 commented 7 months ago

Thank you for the fix, it's working now :-)