Lonami / lzxd

https://crates.io/crates/lzxd
Apache License 2.0

What do "reference data" and "subject data" refer to? #1

Closed jdm closed 10 months ago

jdm commented 4 years ago

https://docs.rs/lzxd/0.1.0/lzxd/enum.WindowSize.html refers to the reference data and the subject data, but I'm having trouble figuring out what that means, and how to determine the window size ahead of time. The context here is that I'm trying to integrate lzxd into this code where I have the size of the compressed data and the decompressed size.

Lonami commented 4 years ago

> how to determine the window size ahead of time

It should be "agreed" beforehand between the party who provides the compressed file and the party interested in decompressing it. That is, you can't determine it ahead of time (unless you're the one compressing it, then you decide, but this library can't compress LZXD yet).

The library was designed with XNB files in mind and I haven't had the chance to test it anywhere else (surprisingly most LZXD implementations out there don't implement compression either :P), so it might need some adjustments to get it to work with other files. PRs or sample files are more than welcome!

The WindowSize talks about reference and subject data because it's pretty much extracted from the official documentation, 2.1.2 Window Size:

The sliding window size MUST be a power of 2, from 2^17 (128 kilobytes (KB)) up to 2^25 (32 megabytes (MB)). The window size is not stored in the compressed data stream and MUST be specified to the decoder before decoding begins. The window size SHOULD be the smallest power of two between 2^17 and 2^25 that is greater than or equal to the sum of the size of the reference data rounded up to a multiple of 32,768 and the size of the subject data.

To be completely honest, I'm not exactly sure what those refer to, since most of the document talks about compression, while the library is based on the reference decompression, which is just one section of the entire document.
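
For what it's worth, the rule quoted above from section 2.1.2 can be written out as plain arithmetic. This is only a minimal sketch, assuming you somehow know the sizes of the reference data and the subject data (which is exactly the open question in this thread). The helper name `recommended_window_size` is hypothetical and is not part of lzxd.

```rust
/// Smallest and largest window sizes allowed by the quoted rule.
const MIN_WINDOW: u64 = 1 << 17; // 128 KB
const MAX_WINDOW: u64 = 1 << 25; // 32 MB

/// Hypothetical helper (not an lzxd API): applies the "SHOULD" rule from
/// section 2.1.2 to a reference data size and a subject data size, in bytes.
/// Returns None if the result would exceed the 32 MB maximum.
fn recommended_window_size(reference_len: u64, subject_len: u64) -> Option<u64> {
    // Round the reference data size up to a multiple of 32,768.
    let rounded_reference = reference_len.checked_add(32_767)? / 32_768 * 32_768;
    // Add the subject data size (not rounded).
    let total = rounded_reference.checked_add(subject_len)?;
    // Smallest power of two >= total, but never below the 2^17 minimum.
    let window = total.checked_next_power_of_two()?.max(MIN_WINDOW);
    if window <= MAX_WINDOW {
        Some(window)
    } else {
        None
    }
}
```

With no reference data and 100,000 bytes of subject data this yields 2^17 (131,072). Note that this is only the size the spec says an encoder SHOULD pick; it does not tell you what a particular file was actually compressed with.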

jdm commented 4 years ago

Luckily, the code I'm integrating it into is actually for parsing XNB files.

Lonami commented 4 years ago

Would you mind sharing it? I was porting xnbcli to Rust myself but didn't get very far / didn't publish what I had.

As for the issue, I'm not sure if there is a reason to keep it open, maybe beyond trying to improve the documentation.

jdm commented 4 years ago

I linked the code in the original comment, but I'll push the latest changes that attempt to integrate this crate and hit an error on the compressed Stardew Valley xnbs that xnbcli can decompress.

jdm commented 4 years ago

The documentation in https://docs.microsoft.com/en-us/openspecs/exchange_server_protocols/ms-patch/5dfc82a3-f31b-48a8-ab4c-7e9cbf8ece9b did help me better understand that reference data and subject data appear to be (more or less) unrelated to the compressed data. I stole the WindowSize value that I needed from xnbcli.

Lonami commented 4 years ago

> I linked the code in the original comment

Oh right my bad :)

> I stole the WindowSize value that I needed from xnbcli.

Yeah, basically I think it's guesswork unless you have access to the SDV source to find what value they use. Do you think we could improve the documentation or should we close the issue?
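
The guesswork could look roughly like the sketch below: try each power of two from 2^17 to 2^25 (the range in the spec excerpt) and keep the first one that decodes to the expected length. Everything here is an assumption for illustration. `guess_window_size` is a hypothetical helper, and the `try_decode` closure stands in for whatever decoder call you actually use; it is not the lzxd API.

```rust
/// Hypothetical helper, not part of lzxd. `try_decode` receives a candidate
/// window size in bytes and returns the decompressed bytes, or None if
/// decoding failed with that window size.
fn guess_window_size<F>(expected_len: usize, mut try_decode: F) -> Option<(u32, Vec<u8>)>
where
    F: FnMut(u32) -> Option<Vec<u8>>,
{
    // Candidate exponents: 2^17 (128 KB) through 2^25 (32 MB).
    for bits in 17..=25u32 {
        let window_size = 1u32 << bits;
        if let Some(data) = try_decode(window_size) {
            // Assumption: a wrong window size is not guaranteed to produce
            // an error, so the known decompressed length is used as an
            // extra sanity check before accepting the result.
            if data.len() == expected_len {
                return Some((window_size, data));
            }
        }
    }
    // No candidate window size produced output of the expected length.
    None
}
```

The length check relies on knowing the decompressed size up front, which is the situation described in the original comment. A matching length is still not proof that the output is correct, so comparing against a known-good decoder (such as xnbcli) remains the safer check.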

jdm commented 4 years ago

It might be worth calling out in the documentation something like this:

The window size is a value that must be calculated based on the original compression settings that were used. When decompressing data for which the original settings are unknown, the only known method for choosing a window size is guess and check.

Lonami commented 4 years ago

Feel free to send a pull request, if you like.

Lonami commented 10 months ago

Closing due to inactivity (but by all means feel free to still send a PR if you desire).