mdsteele / rust-msi

Rust library for reading/writing Windows Installer (MSI) files
MIT License

Stream does not find #6

Closed ikrivosheev closed 3 years ago

ikrivosheev commented 3 years ago

I found some strange behavior. A simple example:

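let mut archive = msi::Package::open(file).unwrap();
// Collect the names up front so the iterator is finished before reading.
let streams = archive.streams().collect::<Vec<String>>();
for name in streams {
    // Try to open every stream that was just listed; print any failure.
    let stream = archive.read_stream(&name);
    if let Err(e) = stream {
        println!("{:?}", e);
    }
}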

Output:

Custom { kind: NotFound, error: "Stream \"\\u{5}DigitalSignature\" does not exist" }
Custom { kind: NotFound, error: "Stream \"\\u{5}MsiDigitalSignatureEx\" does not exist" }

MSI file to reproduce: test.msi.zip.

mdsteele commented 3 years ago

Thanks for the report. I think I've figured out what's happening, but I'm not sure yet which of several possible solutions is most correct.

Short version: These stream names are stored unencoded, which confuses the msi crate. They appear in the list of streams, but when you try to read one, the msi crate encodes the name, fails to find the encoded name in the underlying CFB file, and concludes that the stream doesn't exist.

Longer version: Stream names in the underlying CFB file are stored as UTF-16. To save space, stream names in MSI files are normally encoded, with up to two ASCII characters packed into each UTF-16 character (see https://github.com/mdsteele/rust-msi/blob/master/src/internal/streamname.rs). (Side note: I've never found anywhere where this scheme is documented. If there's a public doc that describes the encoding, I'd love to see it.)
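To make the mismatch concrete, here is a minimal sketch of a packing scheme of the kind described above. The 64-symbol alphabet and the 0x3800 / 0x4800 offsets are assumptions that are not stated in this thread; the linked streamname.rs is the authoritative source, and the function names here are illustrative only. Table-name prefixes are ignored.

```rust
// ASSUMPTION: 64-symbol alphabet (0-9, A-Z, a-z, '.', '_') mapped to 6-bit values.
fn to_b64(ch: char) -> Option<u32> {
    match ch {
        '0'..='9' => Some(ch as u32 - '0' as u32),
        'A'..='Z' => Some(10 + ch as u32 - 'A' as u32),
        'a'..='z' => Some(36 + ch as u32 - 'a' as u32),
        '.' => Some(62),
        '_' => Some(63),
        _ => None,
    }
}

// Inverse of `to_b64` for values in 0..64.
fn from_b64(value: u32) -> char {
    match value {
        0..=9 => (b'0' + value as u8) as char,
        10..=35 => (b'A' + (value - 10) as u8) as char,
        36..=61 => (b'a' + (value - 36) as u8) as char,
        62 => '.',
        _ => '_',
    }
}

// ASSUMPTION: two packable characters share one code unit at 0x3800 + (v2 << 6) + v1;
// a lone packable character becomes 0x4800 + v1; anything else passes through unchanged.
fn encode(name: &str) -> String {
    let mut out = String::new();
    let mut chars = name.chars().peekable();
    while let Some(ch1) = chars.next() {
        match to_b64(ch1) {
            Some(v1) => {
                // Pack the next character into the same code unit when possible.
                if let Some(v2) = chars.peek().copied().and_then(to_b64) {
                    chars.next();
                    out.push(char::from_u32(0x3800 + (v2 << 6) + v1).unwrap());
                } else {
                    out.push(char::from_u32(0x4800 + v1).unwrap());
                }
            }
            None => out.push(ch1),
        }
    }
    out
}

// Reverses `encode`; characters outside the packed ranges pass through unchanged.
fn decode(name: &str) -> String {
    let mut out = String::new();
    for ch in name.chars() {
        let v = ch as u32;
        if (0x3800..0x4800).contains(&v) {
            let packed = v - 0x3800;
            out.push(from_b64(packed & 0x3f));
            out.push(from_b64(packed >> 6));
        } else if (0x4800..0x4840).contains(&v) {
            out.push(from_b64(v - 0x4800));
        } else {
            out.push(ch);
        }
    }
    out
}
```

Under these assumptions, decoding the raw name "\u{5}DigitalSignature" leaves it unchanged, so it shows up in a listing as-is, while encoding it produces a different string. A lookup that encodes the name first therefore never matches the unencoded entry in the CFB file.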

However, there's an exception for the special "\u{5}SummaryInformation" stream that appears in every MSI file, whose name is not encoded, and which the msi crate special-cases (it is excluded from the list of streams, and its data is accessed instead through the summary_info() method). It seems that "\u{5}DigitalSignature" and "\u{5}MsiDigitalSignatureEx" are also special streams whose names are not encoded. (Maybe that's what the \u{5} prefix is supposed to signify? I don't know of anywhere that that's documented, either.)

mdsteele commented 3 years ago

I did some more digging, and at least found some more clues as to the meaning of the DigitalSignature and MsiDigitalSignatureEx data (this code comment in particular was illuminating). I think the best solution is for the msi crate to exclude these from the list of streams, and provide higher-level methods for dealing with them, just as it already does for SummaryInformation.
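The "exclude from the list" idea can be sketched independently of the crate. This is a hypothetical illustration, not the msi crate's actual code: `is_special_stream` and `listable_streams` are made-up names, and the only grounded part is the three stream names quoted in this thread.

```rust
// The three unencoded special stream names mentioned in this thread.
const SPECIAL_STREAM_NAMES: [&str; 3] = [
    "\u{5}SummaryInformation",
    "\u{5}DigitalSignature",
    "\u{5}MsiDigitalSignatureEx",
];

// Hypothetical helper: true if the raw CFB name is one of the special streams.
fn is_special_stream(raw_name: &str) -> bool {
    SPECIAL_STREAM_NAMES.iter().any(|&special| special == raw_name)
}

// Hypothetical helper: keep only the names that a stream listing should expose.
fn listable_streams<'a>(raw_names: &[&'a str]) -> Vec<&'a str> {
    raw_names
        .iter()
        .copied()
        .filter(|name| !is_special_stream(name))
        .collect()
}
```

Matching on exact names rather than on the \u{5} prefix is a deliberate choice in this sketch, since the thread leaves the meaning of that prefix open.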

ikrivosheev commented 3 years ago

@mdsteele thank you for the fixes. Could you publish a small release?

mdsteele commented 3 years ago

Sure; v0.4.0 is now published.