python-openxml / python-docx

Create and modify Word documents with Python
MIT License

left_indent not getting populated when reading existing document #569

Open eddo888 opened 6 years ago

eddo888 commented 6 years ago

Hi, I am reading an existing document and I am not seeing the left_indent being populated.

Consider a document OneTwoThree.docx with the text:

One
    Two
        Three

The sample code below does not show the left_indent values:

from docx import Document

doc = Document('OneTwoThree.docx')

for p in doc.paragraphs:
    # indent set directly on the paragraph
    print('%s:%s' % (p.paragraph_format.left_indent, p.text))
    # indent set on the paragraph's style
    print('%s:%s' % (p.style.paragraph_format.left_indent, p.text))

eddo888 commented 6 years ago

This works to create an indented document:

from docx import Document
from docx.shared import Inches

doc = Document()

p1 = doc.add_paragraph('One')

p2 = doc.add_paragraph('Two')
p2.paragraph_format.left_indent = Inches(1)

p3 = doc.add_paragraph('Three')
p3.paragraph_format.left_indent = Inches(2)

for p in doc.paragraphs:
    print('%s:%s'%(p.paragraph_format.left_indent,p.text))

doc.save('eddo.docx')

This reads the indented document:

from docx import Document

doc = Document('eddo.docx')

for p in doc.paragraphs:
    print('%s:%s'%(p.paragraph_format.left_indent,p.text))

However, how can I see the "tab indents"?

eddo888 commented 6 years ago

eddo.docx: here is an example document that uses tab stops, but I can't read those tab stops.

eddo888 commented 6 years ago

I think it has something to do with numbered or bulleted lists.

eddo888 commented 4 years ago

Is anybody monitoring this thread?

scanny commented 4 years ago

What do you mean "not getting populated"? Show both code and results for unexpected behavior.

eddo888 commented 4 years ago

> What do you mean "not getting populated"? Show both code and results for unexpected behavior.

Please scroll up and see my previous comments with code and examples.

eddo888 commented 2 years ago

Could you find my comments?

Cerebex commented 3 months ago

@scanny I am also having this issue: p.paragraph_format.left_indent and p.style.paragraph_format.left_indent are both None for paragraphs scattered throughout my docx file, even where they should have a value.

Is there any other paragraph property that would house this left_indent value that I am missing?

scanny commented 3 months ago

See the documentation here: https://python-docx.readthedocs.io/en/latest/api/text.html#docx.text.parfmt.ParagraphFormat.left_indent

I think it's safe to say that if you're getting None as the value, then the left indent is not set explicitly on that paragraph. If you're seeing a left indent anyway, it must be coming from further up the style hierarchy, perhaps from a numbering format or the document defaults. You'll have to inspect the XML and see.

So far, python-docx supports inspecting the paragraph and the paragraph style for this value.
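
As a rough illustration of "inspect the XML", the sketch below looks at the w:pPr of a single paragraph and reports any direct w:ind attributes, plus whether the paragraph carries a w:numPr (which would point at a numbering definition). It assumes the paragraph's underlying element is reachable as paragraph._p, which is an internal attribute of python-docx rather than documented API; the helper name direct_indent_info is hypothetical.

```python
# WordprocessingML main namespace (the "w:" prefix in the XML).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def direct_indent_info(paragraph):
    """Report indentation-related XML found directly on one paragraph.

    Assumption: `paragraph._p` is the <w:p> element (internal to
    python-docx). `paragraph._p.xml` can be printed to see the full XML.
    """
    p = paragraph._p
    ind = p.find(f"{{{W}}}pPr/{{{W}}}ind")        # explicit indentation
    num_pr = p.find(f"{{{W}}}pPr/{{{W}}}numPr")   # list/numbering reference
    attrs = None
    if ind is not None:
        # Strip the namespace from attribute names, e.g. "left", "hanging".
        attrs = {k.split("}")[-1]: v for k, v in ind.attrib.items()}
    return {"ind": attrs, "has_numbering": num_pr is not None}

# Usage, assuming python-docx is installed and the file exists:
#   from docx import Document
#   for p in Document("OneTwoThree.docx").paragraphs:
#       print(direct_indent_info(p), repr(p.text))
```

The w:ind values are expressed in twentieths of a point (1440 per inch). If "ind" is None and "has_numbering" is True, the numbering definition is a plausible place to look next.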

Cerebex commented 3 months ago

Thanks for the quick feedback. Would this XML be connected to the paragraph object at all? Can you point me in a direction to start looking for this left indent in the XML?

scanny commented 3 months ago

The details of the XML for paragraph formatting are described here: https://python-docx.readthedocs.io/en/latest/dev/analysis/features/text/paragraph-format.html#indentation

You could possibly unzip the .docx package and do a text search for w:ind.

A paragraph style can inherit from another "base" style and that inheritance can be multiple layers deep, so it's possible it's up that inheritance chain somewhere. So looking in the word/styles.xml member of the package might be a good start.
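
The text search for w:ind does not require unzipping by hand, since a .docx file is a zip archive. A minimal sketch using only the standard library follows; the function name find_indents is hypothetical, and it assumes the package uses the conventional w: prefix for the main namespace.

```python
import re
import zipfile


def find_indents(docx_path):
    """Return {member name: [matching <w:ind ...> tags]} for a .docx package.

    Only members that contain at least one w:ind element are included.
    `docx_path` may be a file path or a binary file-like object.
    """
    hits = {}
    with zipfile.ZipFile(docx_path) as pkg:
        for name in pkg.namelist():
            if not name.endswith(".xml"):
                continue
            text = pkg.read(name).decode("utf-8", errors="replace")
            found = re.findall(r"<w:ind\b[^>]*>", text)
            if found:
                hits[name] = found
    return hits

# Usage, assuming the file exists:
#   for member, tags in find_indents("eddo.docx").items():
#       print(member, len(tags), tags[:3])
```

Hits in word/document.xml are direct paragraph formatting; hits in word/styles.xml or word/numbering.xml suggest the indent comes from a style or a list definition.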

scanny commented 3 months ago

Another possible approach is to navigate the paragraph style hierarchy as far as it goes. A ParagraphStyle object has a .base_style -> ParagraphStyle | None property. You can advance up through that branch of the style hierarchy by looking at each "parent" style until .base_style is None. The first (lowest) level that specifies a property determines the effective value. That's probably the easiest first step.
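
A minimal sketch of that walk, using only the paragraph_format, style, base_style and name properties of python-docx; the function name inherited_left_indent is hypothetical. It covers direct formatting and the paragraph-style chain only, nothing else in the hierarchy.

```python
def inherited_left_indent(paragraph):
    """Return (left_indent, source) for a python-docx paragraph.

    Checks direct paragraph formatting first, then walks the paragraph
    style and its base styles. `source` is "paragraph" or a style name.
    Returns (None, None) when no level in that chain sets a left indent.
    """
    direct = paragraph.paragraph_format.left_indent
    if direct is not None:
        return direct, "paragraph"

    style = paragraph.style
    while style is not None:
        value = style.paragraph_format.left_indent
        if value is not None:
            return value, style.name
        style = style.base_style  # None once the top of the chain is reached
    return None, None

# Usage, assuming python-docx is installed and the file exists:
#   from docx import Document
#   for p in Document("OneTwoThree.docx").paragraphs:
#       print(inherited_left_indent(p), repr(p.text))
```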

But the style hierarchy is not quite as simple as that. There are other factors, such as table styles and document defaults. If the simple approach doesn't work, you'd need to read up on Word style inheritance and probably do some experimentation to find out in what order Word traverses the various nodes in the style hierarchy.
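
Since a numbering format was mentioned earlier as a possible source, here is a hedged sketch that lists the indents defined per list level in the numbering part. It assumes the part is stored at word/numbering.xml (the usual location, though the package relationships are what formally determine it) and it ignores w:lvlOverride elements; the function name numbering_indents is hypothetical. It says nothing about how Word ranks these values against style or direct formatting.

```python
import xml.etree.ElementTree as ET
import zipfile

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def numbering_indents(docx_path):
    """Map (numId, ilvl) -> w:ind attributes defined for that list level.

    Assumption: numbering definitions live in word/numbering.xml.
    Level overrides (w:lvlOverride) are not taken into account.
    """
    with zipfile.ZipFile(docx_path) as pkg:
        if "word/numbering.xml" not in pkg.namelist():
            return {}
        root = ET.fromstring(pkg.read("word/numbering.xml"))

    # First pass: abstractNumId -> {ilvl: indent attributes}
    abstract = {}
    for abstract_num in root.findall("w:abstractNum", NS):
        levels = {}
        for lvl in abstract_num.findall("w:lvl", NS):
            ind = lvl.find("w:pPr/w:ind", NS)
            if ind is not None:
                levels[lvl.get(f"{{{W}}}ilvl")] = {
                    k.split("}")[-1]: v for k, v in ind.attrib.items()
                }
        abstract[abstract_num.get(f"{{{W}}}abstractNumId")] = levels

    # Second pass: each w:num points at one abstract definition.
    result = {}
    for num in root.findall("w:num", NS):
        ref = num.find("w:abstractNumId", NS)
        if ref is None:
            continue
        for ilvl, attrs in abstract.get(ref.get(f"{{{W}}}val"), {}).items():
            result[(num.get(f"{{{W}}}numId"), ilvl)] = attrs
    return result
```

The (numId, ilvl) keys correspond to the w:numId and w:ilvl values inside a paragraph's w:numPr, so a list paragraph can be matched to its level's indent by looking up that pair.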