Dijji / XstReader

Xst Reader is an open source viewer for Microsoft Outlook’s .ost and .pst files, written entirely in C#. To download an executable of the current version, go to the releases tab.
Microsoft Public License
479 stars 70 forks source link

"Sub-nodes of sub-nodes not yet implemented" crash in pst file with embedded message attachment #35

Closed hughbe closed 3 years ago

hughbe commented 3 years ago

See https://filebin.net/owmuw1r7tdum67a6 for Outlook.pst file (GitHub doesn't allow me to upload pst files)

Folder: Root>"Top of Personal Folders">"Inbox" Message: subject of "this message contains embedded MSG" Attachment: index 0 Property: PidTagAttachDataBinary

This property has dwValueHnid of 128 which is of type HID.

Reading the data we get nid of 2097572, which is of type NORMAL_MESSAGE with index 65549.

The discovered node has a dataBid of 8256 and subDataBid of 8250

This causes a crash in https://github.com/Dijji/XstReader/blob/master/NDB.cs#L104

        // Read raw data, accessed via a sub node
        // If it has a multiblock structure, return all of the blocks' contents concatenated
        public byte[] ReadSubNodeDataBlock(FileStream fs, BTree<Node> subNodeTree, NID nid)
        {
            if (!nid.HasValue)
                return null;
            var n = LookupSubNode(subNodeTree, nid);
            if (n == null)
                throw new XstException("Node not found in sub node tree");
            if (n.SubDataBid != 0)
                throw new XstException("Sub-nodes of sub-nodes not yet implemented");
            return ReadDataBlock(fs, n.DataBid);
        }
hughbe commented 3 years ago

image


        public Message OpenAttachedMessage(Attachment a)
        {

            using (FileStream fs = ndb.GetReadStream())
            {
                BTree<Node> subNodeTreeMessage = a.subNodeTreeProperties;

                if (subNodeTreeMessage == null)
                    // No subNodeTree given: assume we can look it up in the main tree
                    ndb.LookupNodeAndReadItsSubNodeBtree(fs, a.Parent.Nid, out subNodeTreeMessage);

                var subNodeTreeAttachment = ltp.ReadProperties<Attachment>(fs, subNodeTreeMessage, a.Nid, pgAttachmentContent, a);
                if (a.Content.GetType() == typeof(PtypObjectValue))
                {
                    // I added this to demo the API
                    ndb.ReadSubNodeDataBlock(fs, subNodeTreeAttachment, new NID(((PtypObjectValue)a.Content).Nid));
                    ...
Dijji commented 3 years ago

I downloaded your pst file, and using the out of the box UI am able to open both of the messages attached to the 'this message contains embedded…' message.

I assume, therefore, that the crash only occurs when the code is modified as described in your second comment. If that is the case, then are you asking me to debug your change?

hughbe commented 3 years ago

Oh sorry I should have clarified that I’m using your code as an API in a library I’m developing. Essentially I called

ndb.ReadSubNodeDataBlock(fs, subNodeTreeAttachment, new NID(((PtypObjectValue)a.Content).Nid));

And this triggered the Not Yet Implemented bug. So this does not affect the UO

What I am asking is if it is possible to fix the NYI in the readsubNodeDataBlock API to support cases like this

The use case for me is to be able to read embedded data programmatically (i.e not thru UI)

Btw thank you so much for the library it is exceptionally useful and helpful in understanding the PST file format

Dijji commented 3 years ago

I think that what you are actually finding is that the call to read a sub node that you are making is not actually correct at the point you are making it.

The fact that the standard UI reads the attached message successfully means that there is in fact no block to reading such attachments in the code. I do recall when I was testing this scenario in the first place that getting the right sub node at the right time was pretty tricky.

My suggestion is that you track the path through the code taken when the UI opens the attached message, and find out where you are going wrong in making your call.

hughbe commented 3 years ago

I think you're right from a UI point of view - I'm just seeing if we could possibly implement more functionality in this library. My use case is that I'm using the code as a library (i.e. not UI).

You're probably right that I am calling the API incorrectly. After all, as you pointed out you can actually read these files in the UI.

However, I would point out that throwing because something is not implemented appears to me to be a defect in itself. I would argue that this should be fixed - although it is not a priority. Note that the exception message is "Sub-nodes of sub-nodes not yet implemented"

Maybe one (untested) fix is

public byte[] ReadSubNodeDataBlock(FileStream fs, BTree<Node> subNodeTree, NID nid)
{
    if (!nid.HasValue)
        return null;
    var n = LookupSubNode(subNodeTree, nid);
    if (n == null)
        throw new XstException("Node not found in sub node tree");
    if (n.SubDataBid != 0)
    {
        var childSubNodeTree = ReadSubNodeBTree(fs, n.SubDataBid);
        return ReadSubNodeDataBlock(fs, childSubNodeTree, nid);
    }
    return ReadDataBlock(fs, n.DataBid);
}
Dijji commented 3 years ago

Note that the function that throws the exception returns a single buffer with the contents of any multiblock structure concatenated. So I don't like your fix because it returns either the contents of the node or the contents of the sub node. If it were to be implemented, it would have to concatenate the node and sub node contents as part of a single buffer. However, it is not at all clear how this might be done: are the sub node contents interpolated, placed at the end, or what?

I think the function is correct as it stands because of this lack of clear meaning. Also, I think that the fact that the scenario has not arisen is a good indicator that it is not required.

Also note that you can get more detailed access by invoking the alternative function that returns an enumeration of data blocks.

hughbe commented 3 years ago

OK. You make some good points. I'll close this issue :)