mdsteele / rust-cfb

Rust library for reading/writing Compound File Binary (structured storage) files
MIT License
46 stars, 20 forks

Malformed directory two red nodes in a row #10

ikrivosheev closed 3 years ago

ikrivosheev commented 3 years ago

I'm getting this error on a .doc file when I try to open it for reading:

Custom { kind: InvalidData, error: "Malformed directory (two red nodes in a row)" }

MS-DOC says that

2.1 File Structure 
A Word Binary File is an OLE compound file as specified by [MS-CFB]. The file consists of the 
following storages and streams. 

File: file-sample_100kB.doc.zip

mdsteele commented 3 years ago

Thanks for the report.

Short version: This file seems to technically violate the MS-CFB spec, but in a way that doesn't really matter, so the cfb crate ought to still be able to read it. I'll make a change.

Long version: Each directory entry in a CFB file has a color bit that can be "red" or "black"; this is intended to be used so that the directory tree can be maintained as a balanced red-black tree. Strangely, the usual red-black tree balancing rule (all paths must have the same number of black nodes) is totally optional; the MS-CFB spec section 2.6.4 makes clear that it's perfectly valid to just color all nodes black and leave the tree unbalanced (which is what the cfb crate currently does). However, the spec also says "Two consecutive nodes MUST NOT both be red," and in this file, it looks like all the nodes in the tree are colored red.
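To make the "two consecutive red nodes" rule concrete, here is a minimal sketch of how such a check could look. The `Color` and `Entry` types and the `find_red_red` function are hypothetical, simplified stand-ins invented for illustration; they are not the cfb crate's actual types or its validation code.

```rust
// Hypothetical, simplified model of a directory entry (NOT the cfb crate's types).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Color {
    Red,
    Black,
}

struct Entry {
    color: Color,
    // Indices of the left and right children within the entry slice.
    left: Option<usize>,
    right: Option<usize>,
}

/// Returns the (parent, child) indices of the first red-red edge found,
/// or None if no red node has a red child.
fn find_red_red(entries: &[Entry], root: Option<usize>) -> Option<(usize, usize)> {
    let mut stack: Vec<usize> = root.into_iter().collect();
    let mut visited = vec![false; entries.len()];
    while let Some(i) = stack.pop() {
        // This sketch simply skips out-of-range indices and already-visited
        // nodes (which would indicate a cycle in a malformed file).
        if i >= entries.len() || visited[i] {
            continue;
        }
        visited[i] = true;
        let entry = &entries[i];
        for child in entry.left.into_iter().chain(entry.right) {
            if child >= entries.len() {
                continue;
            }
            if entry.color == Color::Red && entries[child].color == Color::Red {
                return Some((i, child));
            }
            stack.push(child);
        }
    }
    None
}
```

On a tree where every node is red, as in the attached file, a check like this reports a violation at the first parent/child pair it visits; on an all-black tree it reports nothing.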

But...there's no real reason for the cfb crate to complain about this when reading the file, since it doesn't even really use the color bit right now. (And if we ever do want to use the color bit, we could just make it fix the invalid tree when modifying the file, rather than refusing to open and read the file in the first place.) So I'll make a change to stop enforcing that part of the spec.
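One simple way to "fix the invalid tree when modifying the file" would be to recolor every node black, which the spec permits as noted above. The following is only a sketch of that idea under assumed types; `Color`, `Entry`, and `recolor_all_black` are hypothetical names, and this is not the change that was actually made to the crate.

```rust
// Hypothetical, simplified model of a directory entry (NOT the cfb crate's types).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Color {
    Red,
    Black,
}

struct Entry {
    color: Color,
}

/// Colors every entry black and returns how many entries were changed.
/// An all-black tree trivially has no two consecutive red nodes.
fn recolor_all_black(entries: &mut [Entry]) -> usize {
    let mut changed = 0;
    for entry in entries.iter_mut() {
        if entry.color == Color::Red {
            entry.color = Color::Black;
            changed += 1;
        }
    }
    changed
}
```

Because the repair only runs on a write path, a reader following this approach can open a file with bad color bits without rejecting it.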

ikrivosheev commented 3 years ago

@mdsteele, thank you very much for the answer and for the fix!

Can you make a release with the fix?

mdsteele commented 3 years ago

I've just published v0.6.1 with this fix.