python-openxml / python-docx

Create and modify Word documents with Python
MIT License
4.52k stars 1.11k forks

feature: add custom properties support #91

Open Apteryks opened 10 years ago

Apteryks commented 10 years ago

Add read/write as well as add/delete functionality regarding custom properties.

scanny commented 10 years ago

Hi Maxim, can you say a little more about what these are and how one might want to use them? I'm guessing they're the properties in the app.xml part but not quite sure :)

Apteryks commented 10 years ago

Hello Steve! The custom properties can be accessed/modified by navigating to the File menu --> Properties --> Advanced properties --> Customization (or something similar - rightmost tab) of any MS Office document. These properties are stored in the /docProps/custom.xml file.

They can be useful for storing extra bits of information easily in a document. For example, when generating documents programmatically, it could be necessary to know if the template used is version X or in language Y. The following is a snippet of the custom properties (I've streamlined the file) in one of the documents I'm working with:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties
  xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
  xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="7"
    name="ProjectId">
    <vt:lpwstr>None</vt:lpwstr>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="9"
    name="ClientName">
    <vt:lpwstr>None</vt:lpwstr>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="10"
    name="DocumentVersion">
    <vt:lpwstr>None</vt:lpwstr>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="13"
    name="TemplateLanguage">
    <vt:lpwstr>None</vt:lpwstr>
  </property>
</Properties>
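
A part like the one above can be inspected without any library support. The following is a minimal sketch using only the Python standard library; the helper name `parse_custom_properties` and the shortened `SAMPLE` are made up for illustration and are not part of python-docx. It assumes each property holds a single `vt:*` child and returns its raw text regardless of type.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the snippet above.
CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"

# Shortened copy of the snippet above (bytes, so the XML declaration is honored).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties
  xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
  xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="7"
    name="ProjectId">
    <vt:lpwstr>None</vt:lpwstr>
  </property>
  <property fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" pid="13"
    name="TemplateLanguage">
    <vt:lpwstr>None</vt:lpwstr>
  </property>
</Properties>
"""


def parse_custom_properties(xml_bytes):
    """Return {name: text} for every <property> element (hypothetical helper)."""
    root = ET.fromstring(xml_bytes)
    props = {}
    for prop in root.findall("{%s}property" % CUSTOM_NS):
        # Assumption: the value is the text of the first (only) vt:* child.
        props[prop.get("name")] = prop[0].text if len(prop) else None
    return props


print(parse_custom_properties(SAMPLE))
```

Note that the values in this sample are the literal string "None", not Python's `None`.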

scanny commented 10 years ago

Ah, the app.xml, got it.

I have on my list to adapt the Core Properties implementation from python-pptx over to python-docx but haven't quite got around to it yet: https://github.com/scanny/python-pptx/blob/master/pptx/parts/coreprops.py

Looks like this one actually has a lower XML element count.

Any idea what those fmtid GUIDs are?

Apteryks commented 10 years ago

The custom properties I'm talking about live in the file custom.xml (I just checked, this is also true for .pptx files). If your document doesn't have any custom property defined, then you will not see this file.

I could find some information regarding the fmtid, which stands for "Format ID", in the ISO-IEC-29500-1.pdf referenced specification, section 22.3, "Custom Properties". It says the fmtid "uniquely relates a custom property with an OLE property", and "The possible values for this attribute are defined by the ST_Guid simple type (§22.9.2.4)."
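
As a small illustration of the GUID format mentioned above, the sketch below parses the `fmtid` value from the earlier snippet with Python's standard `uuid` module. The variable names are made up for illustration; nothing here is python-docx API.

```python
import uuid

# fmtid copied from the custom.xml snippet earlier in this thread.
FMTID_TEXT = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}"

# uuid.UUID accepts the brace-wrapped form used in the XML attribute.
fmtid = uuid.UUID(FMTID_TEXT)

# Round-trip back to the upper-case, brace-wrapped spelling seen in the file.
round_tripped = "{%s}" % str(fmtid).upper()
print(round_tripped == FMTID_TEXT)
```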

scanny commented 10 years ago

Got it, thanks Maxim :)

renejsum commented 8 years ago

I have added custom properties to python-docx. Since custom props are named by the user, I am not using the @property decorator, but implementing it as a dictionary.

The interface right now is

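from docx import Document

# custom_properties exists only in my modified python-docx, not the official release
document = Document('demo.docx')
# read an existing property (the original line lacked the %s placeholder)
print("revision is: %s" % document.custom_properties["Revision"])
# add or overwrite properties, then save
document.custom_properties["Name"] = "Rene"
document.custom_properties["Revision"] = "1.0.0"
document.save("demo.docx")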
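
The implementation behind this interface is not shown in the thread. Purely as an illustration of how a dictionary-style interface could sit on top of the custom.xml structure, here is a sketch built on the standard library. The class name `CustomProperties` is hypothetical, only string (`vt:lpwstr`) values are handled, and numbering new `pid` values from 2 is an assumption.

```python
import xml.etree.ElementTree as ET
from collections.abc import MutableMapping

CUSTOM_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
VT_NS = "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"
FMTID = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}"  # copied from the snippet above
PROP_TAG = "{%s}property" % CUSTOM_NS
LPWSTR_TAG = "{%s}lpwstr" % VT_NS


class CustomProperties(MutableMapping):
    """Hypothetical dict-like view over a <Properties> element (strings only)."""

    def __init__(self, root):
        self._root = root

    def _find(self, name):
        for prop in self._root.findall(PROP_TAG):
            if prop.get("name") == name:
                return prop
        return None

    def __getitem__(self, name):
        prop = self._find(name)
        if prop is None:
            raise KeyError(name)
        return prop[0].text

    def __setitem__(self, name, value):
        prop = self._find(name)
        if prop is None:
            # Assumption: new pids continue from the highest one in use, starting at 2.
            pids = [int(p.get("pid")) for p in self._root.findall(PROP_TAG)]
            attrs = {"fmtid": FMTID, "pid": str(max(pids, default=1) + 1), "name": name}
            prop = ET.SubElement(self._root, PROP_TAG, attrs)
            ET.SubElement(prop, LPWSTR_TAG)
        prop[0].text = str(value)

    def __delitem__(self, name):
        prop = self._find(name)
        if prop is None:
            raise KeyError(name)
        self._root.remove(prop)

    def __iter__(self):
        return (p.get("name") for p in self._root.findall(PROP_TAG))

    def __len__(self):
        return len(self._root.findall(PROP_TAG))


props = CustomProperties(ET.Element("{%s}Properties" % CUSTOM_NS))
props["Revision"] = "1.0.0"
props["Name"] = "Rene"
del props["Name"]
print(dict(props))
```

Subclassing `MutableMapping` provides `get`, `items`, `update` and the other dict methods for free once the five methods above are defined.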

Steve proposed that I had a look at OPC and implement it there, since it is shared across all Office formats. I will have a look at that.

I am working a bit on and off on this, but have a week's vacation this week :-)

Apteryks commented 8 years ago

Excellent, renejsum! The interface makes sense to me!

pa2wlt commented 8 years ago

In the reply above, @renejsum wrote that he added custom_properties to python-docx, but I can't find it in the code or in a corresponding commit. Do @scanny or @Apteryks have any suggestions on where to look?

scanny commented 8 years ago

I would look for it in a fork on @renejsum's GitHub page.

pa2wlt commented 8 years ago

Thx @scanny. I did so indeed. However, @renejsum has only one fork, python-opc, and it has only one contributor... @scanny. The latest commit there is dated Oct 2013, and @renejsum's post above was made two years later. Perhaps I looked in the wrong place?

scanny commented 8 years ago

Hmm, sounds like @renejsum hasn't published yet.

What's your use-case/situation?

renejsum commented 8 years ago

Sorry guys, I have been a little distracted by other things lately.

I have uploaded my version of python-docx to https://github.com/renejsum/python-docx. It was not created as a fork; I just cloned @scanny's repo. I am not a git expert, so I'm not sure whether that will have any implications.

You are welcome to look at my changes and use them however you see fit...

@scanny proposed to implement the customprops in python-opc, but I have not done that due to lack of talent and time :-)

pa2wlt commented 8 years ago

Thx @renejsum and @scanny. Will pick this up later and see if I can get it to work.

Use-case/situation: I once created some VBA forms and macros in Word that do little more than provide a reasonably user-friendly interface for the Custom Document Properties. However, VBA support in Office 2016 for Mac is very limited. Microsoft has since introduced Office add-ins built with HTML/JavaScript; I created one with some fields and buttons, and it works fine. However, there seems to be no option to read or write the Custom Document Properties from an add-in. So I thought I might give it a try with Python, though I'm not a Python expert.

renejsum commented 8 years ago

Example of usage of my changes can be found at https://gist.github.com/renejsum/0ea621aabd0e4640391ddb3e361f6c02

HeikoNardmann commented 8 years ago

So is there a (more manual) way to add a custom doc property (apart from using this fork)?

renejsum commented 8 years ago

I am not sure what you mean?

A manual way to add a custom prop to a Word doc?

Or a manual way to change the Python code?

HeikoNardmann commented 8 years ago

A manual way to add a custom prop to a Word doc. Using the official release - not using your more convenient fork code.

renejsum commented 8 years ago

No, there is no way to do that. When I get around to it, I will open a pull request to get it into the main branch.

Ricyteach commented 8 years ago

The example usage looks great. Any chance this can be added to the main version soon?

renejsum commented 8 years ago

Fine by me, but I am not sure how to do that, maybe @scanny can explain

scanny commented 8 years ago

There are no active developers on python-docx at the moment, so backlog isn't being retired systematically or anything like that. We have a mechanism for sponsored development if you have project budget. And occasionally we get some folks willing to take on the learning curve for the test-driven development to add a feature or two. But otherwise there's no mechanism for new features to periodically appear like lovely surprises. Someone has to invest the work :)

Ricyteach commented 8 years ago

No active developers? Bummer. Wish I knew more so I could just do it myself.

Sealatron commented 6 years ago

I'm wondering what the status of this feature is - for my use case I need to add/modify custom properties of files (the company I work for uses several custom Doc Properties to manage documents).

renejsum commented 6 years ago

It should work in my repo, but it has not been ported back to the main repo. When I get the time, I will create a pull request for @scanny, but if you have time, you are welcome to fork the main repo, move my changes over and create a pull request...

Sealatron commented 6 years ago

If I can figure out how to do that, I might :). I was looking at your repo and I think it's exactly what I'm looking for, thank you.

jghaines commented 5 years ago

@michael-koeller has put this in a PR https://github.com/python-openxml/python-docx/pull/580

BastienFaure commented 5 years ago

Any information about this issue ? I would love to see the feature merged into the project :+1:

gaurangsn commented 4 years ago

Like other users above, I believe this would be a great addition to python-docx, and I would appreciate it if anyone could provide an update on when it will be available.

geobeo commented 4 years ago

Same here, please merge this feature into the project. It would be amazing to never ever have to see that horrible Word properties edit box ever again!

SiboVG commented 4 years ago

Having the same issue, this feature really is the backbone of one of the projects for my company and seems like a much used feature :)

dani2735 commented 4 years ago

Same issue here. I have tried @renejsum's module, and it allows modifying fields, but it does not recognize headers or footers, nor does it update fields. Is there a possibility of merging both projects?

bogannz commented 4 years ago

Please merge this feature

mohitmathew commented 2 years ago

This is such a useful feature to have. If there are no concerns could you please merge the branch ? This will help create smart documents and will be very very useful.

prdubois commented 2 years ago

+1, this would be quite useful. I need to parse a bunch of docx files and some valuable info is stored in custom properties.
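
For read-only use cases like this one, the standard library may be enough as a stopgap, since a .docx file is a ZIP archive. The sketch below is not python-docx API: the function name `read_custom_properties` is hypothetical, it assumes the part lives at the conventional path /docProps/custom.xml mentioned earlier in this thread, and it returns each value as raw text regardless of its `vt:*` type.

```python
import xml.etree.ElementTree as ET
import zipfile

# Assumption: the conventional part name; strictly, the package relationships
# decide where the custom properties part lives.
CUSTOM_PART = "docProps/custom.xml"
PROP_TAG = (
    "{http://schemas.openxmlformats.org/officeDocument/2006/custom-properties}property"
)


def read_custom_properties(path):
    """Return {name: text} for the custom properties of one .docx file."""
    with zipfile.ZipFile(path) as package:
        # Documents without any custom property have no custom.xml part.
        if CUSTOM_PART not in package.namelist():
            return {}
        root = ET.fromstring(package.read(CUSTOM_PART))
    return {
        prop.get("name"): (prop[0].text if len(prop) else None)
        for prop in root.findall(PROP_TAG)
    }
```

Calling `read_custom_properties("demo.docx")` on each file in a folder would then give one dictionary per document.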

isosphere commented 2 years ago

Thank you for your PR, @michael-koeller. ~Because this project is abandoned,~ Your commit on your repo is now my source for python-docx.

Apteryks commented 2 years ago

@isosphere what makes you say that this project is abandoned? While not actively developed, I believe @scanny is still committed to minimally maintain it so that it can be used with current Python versions.

The problem here as I see it is that some change was made, but no pull request was ever made. The change would need to come with proper test coverage.

Features don't happen magically; they need the dedication of people pushing them through the review process and into the main branch. It seems not much is missing here, so perhaps you could volunteer to make it happen?

mohitmathew commented 2 years ago

I disagree

That the PRs were made years ago and never merged tells me that this is abandoned.

isosphere commented 2 years ago

@Apteryks I didn't mean any offense by my comment, but as a matter of fact the PR was made with test coverage. See https://github.com/python-openxml/python-docx/pull/580, which was made 4 years ago and has had no comment from the maintainers.

I don't mean to denigrate the work done to "minimally maintain" the project, as you say, but my use case requires these features for the library to be useful to me. I don't expect people to work for free, but @michael-koeller did just that and I was thanking them for it.

In hindsight I should have avoided potentially inflammatory words in a GitHub issue. Mea culpa.

Apteryks commented 2 years ago

Oh, I wasn't aware of this PR, thanks for pointing it out to me. That changes my view of the issue :-). I'll continue the discussion there. Note that I'm not an official maintainer of python-docx, but let's see if we can get the ball rolling!

dmuensterer commented 2 years ago

Hi all, thanks for your effort. In the meantime I've been working on supporting app properties as well, i.e. the company or application property that can be set in MS Word. I've essentially cloned the core property classes and changed the necessary bits. However, I am not able to get past an error that's thrown when trying to create my AppProperty element. Unfortunately I'm not the best Python developer. @Apteryks @michael-koeller would either of you be willing to have a look? I think python-docx would be significantly more convenient once it supports all types of properties!

Apteryks commented 2 years ago

@dmuensterer that looks useful, but I think we should focus on merging https://github.com/python-openxml/python-docx/pull/580 first, perhaps. I've made some review comments there which haven't yet been addressed, but otherwise it looks good!