ironfede / openmcdf

Microsoft Compound File .net component - pure C# - netstandard 2.0
Mozilla Public License 2.0

Adding custom Word properties #104

Open Sicos1977 opened 7 months ago

Sicos1977 commented 7 months ago

Hi,

I'm trying to add custom Word properties to a Word .doc file but can't figure out how to do this. Do you have any tips on where to look?

I already tried something like this:

var compoundFile = new CompoundFile(new FileInfo(@"c:\kees\TEST.doc").OpenRead());
var documentSummaryInformationStream = compoundFile.RootStorage.GetStream("\u0005DocumentSummaryInformation");
var container = documentSummaryInformationStream.AsOLEPropertiesContainer();
var prop = container.NewProperty(VTPropertyType.VT_BOOL, 0x0000000E, "Test");
prop.Value = true;
//container.AddProperty(prop);
container.Save(documentSummaryInformationStream);
compoundFile.SaveAs(@$"d:\{Guid.NewGuid()}.doc");

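But I then get this exception:

[image: screenshot of the exception]

By custom I mean these properties:

[image: screenshot of the custom properties]

In the structured storage explorer I see them like this:

[image: screenshot of the properties in the structured storage explorer]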

Numpsy commented 7 months ago

May or may not have been fixed by #97 ?

Sicos1977 commented 7 months ago

Nope, that doesn't work either... and I also have no idea what number I need to use as the identifier for a property.

Numpsy commented 7 months ago

Hmm, I thought I'd added some unit tests when I made that fix, but maybe not.

I think you need to add the new user defined property to the UserDefinedProperties collection in the property container to actually add it to the document though?

Sicos1977 commented 7 months ago

I don't get an error when saving them to the "summaryInformationStream", but then Microsoft Office claims that the file has been modified and is not safe, and opens it in protected mode. Custom properties should be in the documentSummaryInformation stream, so that is probably why it is complaining.

Sicos1977 commented 7 months ago

I also have no idea what to use for the property identifier.

Sicos1977 commented 7 months ago

It probably has to be something like what I'm doing here --> https://github.com/Sicos1977/MsgKit/blob/master/MsgKit/PropertyTags.cs

Numpsy commented 7 months ago

As it stands, the API is a bit rough here because custom properties should only need names, not identifiers. I started looking at it as part of #98, then got snowed under with other things and haven't completed it. I hope to have another look shortly, as I want to use the writing side in a Linux app where the Windows platform APIs don't exist.

Sicos1977 commented 7 months ago

That is exactly the same reason why I'm using OpenMCDF: I'm trying to remove all bindings my software has with Windows. At the moment I still use DSOFile, but that relies on Windows APIs and is very old.

Numpsy commented 7 months ago

I've been debugging the code for writing the custom property set and have found a few issues. I will see about sending some PRs over the weekend.

Numpsy commented 7 months ago

I've got a set of changes in https://github.com/Numpsy/openmcdf/tree/users/rw/user_defined that gets as far as adding properties that structured storage explorer will read, but Word doesn't like the result.

Looking at the binary data in the file, there are some other changes in the written file compared to the original. Possibly something is going wrong when writing the DOCPARTS or HEADINGPAIR fields, whose contents storage explorer can't display, but I haven't looked into it yet.

Sicos1977 commented 7 months ago

I tried your modifications but it keeps giving me a corrupt Word document. Structured storage explorer is also giving an exception.

Numpsy commented 7 months ago

It might additionally be falling over the issue described at https://github.com/ironfede/openmcdf/issues/98#issuecomment-1725302800, where it may or may not work depending on whether the initial document is using LPSTR or LPWSTR for some of the properties :-(

Numpsy commented 7 months ago

OK, I've found another bug: the wrong property set GUID was being used when writing files. Fixing that has resulted in the Windows Explorer property sheets showing the new custom properties. I will do another PR later for that.

Numpsy commented 7 months ago

I've pushed another set of changes into https://github.com/Numpsy/openmcdf/tree/users/rw/all_the_changes. It seems to be more to Word's liking now, but I've only tried it with a couple of files, so there might still be issues left.

Also created #109

Numpsy commented 7 months ago

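Using the latest code, you can try something like:

using System.Linq;
using OpenMcdf;
using OpenMcdf.Extensions;
using OpenMcdf.Extensions.OLEProperties;

using (CompoundFile cf = new CompoundFile("2custom.doc"))
{
    // Custom (user defined) properties live in the DocumentSummaryInformation stream
    var dsiStream = cf.RootStorage.GetStream("\u0005DocumentSummaryInformation");
    var co = dsiStream.AsOLEPropertiesContainer();
    var userProperties = co.UserDefinedProperties;

    // Pick the next free identifier and register the property name under it
    // (assumes the file already has at least one user defined property)
    var newPropertyId = userProperties.PropertyNames.Keys.Max() + 1;
    userProperties.PropertyNames[newPropertyId] = "MyCustomProperty";

    // Create the property with the same identifier and add it to the user defined set
    var newProperty = co.NewProperty(VTPropertyType.VT_LPSTR, newPropertyId);
    newProperty.Value = "SomethingOrOther";
    userProperties.AddProperty(newProperty);

    // Write the property set back to the stream, then save a modified copy of the file
    co.Save(dsiStream);
    cf.SaveAs(@"test_modify_summary.doc");
}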

Could still do with something built in to manage the numbering, and a means of adding a new user defined property section to a file that doesn't have one, but I hope the core property parts are working now.
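
Until something like that is built in, the numbering could be handled by a small helper along these lines. This is only a sketch: PropertyIdHelper and NextPropertyId are hypothetical names, not part of OpenMCDF, and it assumes the keys of PropertyNames can be enumerated as uint values, as the Keys.Max() call above suggests. It starts at 2 on the basis that identifiers 0 and 1 are reserved for the dictionary and code page entries of a property set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of OpenMCDF.
public static class PropertyIdHelper
{
    // Identifiers 0 (dictionary) and 1 (code page) are reserved,
    // so the first identifier usable for a custom property is 2.
    private const uint FirstUsableId = 2;

    // Returns one more than the highest identifier in use,
    // or FirstUsableId when no identifiers are in use yet.
    public static uint NextPropertyId(IEnumerable<uint> existingIds)
    {
        if (existingIds == null)
            throw new ArgumentNullException(nameof(existingIds));

        // DefaultIfEmpty stops Max() from throwing on an empty sequence.
        uint highest = existingIds.DefaultIfEmpty(FirstUsableId - 1).Max();
        return Math.Max(highest + 1, FirstUsableId);
    }
}
```

In the example above it would replace the Keys.Max() + 1 line, e.g. var newPropertyId = PropertyIdHelper.NextPropertyId(userProperties.PropertyNames.Keys);. It does not address the other gap, adding a user defined property section to a file that doesn't have one.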