Open scanny opened 3 years ago
demo.docx:
ABCD
EFGA
code:
doc = docx.Document('demo.docx')
for par in doc.paragraphs:
    isolate_run(par, 0, len(par.runs))
    for run in par.runs:
        print(run.text)
print:
A
BCD
E
FGA
A
B
C
D
E
F
G
A
If you want each run to be one character long you can use something like:
for start in range(len(paragraph.text)):
    end = start + 1
    isolate_run(paragraph, start, end)
The len(par.runs) that appears in your code is the number of runs in the paragraph, not a character position, so it has nothing to do with what you're trying to do.
The way to think about start and end is like this:
"""
paragraph.text: "ABCDE"

    A B C D E
     |     |
   start  end
"""
>>> start, end = 1, 4
>>> run = isolate_run(paragraph, start, end)
>>> run.text
'BCD'
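To make the offsets concrete, here is a minimal pure-Python sketch that models a paragraph's runs as plain strings. It is not the isolate_run() from this thread and touches no docx objects; isolate_range is a hypothetical name, and all it is meant to show is how start and end (end exclusive, like a slice) carve up the existing runs.

```python
def isolate_range(run_texts, start, end):
    """Return new run texts in which characters [start, end) form one run.

    `run_texts` models a paragraph's runs as a list of strings. Empty
    runs are dropped, which a real implementation may or may not do.
    """
    text = "".join(run_texts)
    if not 0 <= start < end <= len(text):
        raise ValueError("need 0 <= start < end <= len(paragraph text)")

    result = []
    offset = 0
    isolated_added = False
    for run_text in run_texts:
        run_start, run_end = offset, offset + len(run_text)
        offset = run_end

        # Part of this run that lies before the range (may be empty).
        before = text[run_start:min(run_end, start)]
        # Part of this run that lies after the range (may be empty).
        after = text[max(run_start, end):run_end]

        if before:
            result.append(before)
        # The first run that overlaps the range receives the whole range.
        if run_start < end and run_end > start and not isolated_added:
            result.append(text[start:end])
            isolated_added = True
        if after:
            result.append(after)
    return result


print(isolate_range(["AB", "CDE"], 1, 4))  # ['A', 'BCD', 'E']
print(isolate_range(["ABCD"], 0, 1))       # ['A', 'BCD']
```

The second call mirrors the question above: if each paragraph in demo.docx starts out as a single run (an assumption), then len(par.runs) is 1 and the call isolates only the first character, which matches the "A" / "BCD" output.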
Thank you very much for your patience. This will be of great help to me. Thank you!
Thanks, @scanny. That's awesome.
I am having trouble understanding the return Run(r, paragraph).
NameError: name 'Run' is not defined
I have a Word document that is generated from checkboxes selected by the user. Each checkbox pulls text from a plain-text file into the Word document. I would like to add styles to certain lines, which I mark with a leading series of hyphens. For example, the Word document may look like this:
------Details about the incident
Here are the events that detail the specific incident in question. below are the listed sub categories
---This is sub category 1
details about this section
---This is sub category 2
details about this section
I would like any line with "------" to be bold and a larger font, and any line with "---" to just be bold.
https://github.com/python-openxml/python-docx/issues/30#issuecomment-879593691 works great for simply replacing the text (removing the ------ but leaving the text), but any formatting I apply to the run then applies to everything in it.
I would assume isolate_run() would work for me, but I cannot get past the return Run(r, paragraph) to even walk through how to make it do what I need.
Here is how I am calling paragraph_replace_text(doc, '------'):
def formatting(document, oldText):
    oldTextLength = len(oldText)
    for oldPara in document.paragraphs:
        if oldPara.text.find(oldText) >= 0:
            paraText = oldPara.text
            for line in paraText.splitlines():
                if oldText in line:
                    newText = line[oldTextLength:]
                    paragraph_replace_text(oldPara, re.compile(f'{oldText}{newText}'), newText)
My thought is that inside paragraph_replace_text I would call isolate_run after the '------' is removed, passing the start and end of the remaining text, but I can't get past the return Run(r, paragraph) to try it.
Any help would be appreciated.
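On the NameError: Run is a python-docx class, so the module that defines isolate_run() presumably just needs `from docx.text.run import Run` at the top. For the hyphen markers, one possible approach is to compute the character range of each marked line first and hand those ranges to isolate_run(). The sketch below covers only that offset calculation in plain Python; marked_line_ranges and MARKERS are hypothetical names, and it assumes lines inside a paragraph are separated by "\n" in paragraph.text.

```python
# Longest marker first, so "------" is not mistaken for "---".
MARKERS = ("------", "---")


def marked_line_ranges(text):
    """Return (start, end, marker) for each line that begins with a marker.

    Offsets index into `text`, `end` is exclusive, and the range still
    includes the marker itself.
    """
    ranges = []
    offset = 0
    for line in text.split("\n"):
        for marker in MARKERS:
            if line.startswith(marker):
                ranges.append((offset, offset + len(line), marker))
                break
        offset += len(line) + 1  # +1 for the "\n" removed by split()
    return ranges


# Hypothetical use with the helpers from this thread (not run here):
#
#   from docx.shared import Pt
#
#   for start, end, marker in reversed(marked_line_ranges(paragraph.text)):
#       run = isolate_run(paragraph, start, end)
#       run.text = run.text[len(marker):]  # drop the hyphens
#       run.bold = True
#       if marker == "------":
#           run.font.size = Pt(16)         # arbitrary example size
```

Working through the ranges in reverse keeps the earlier offsets valid, since stripping a marker only shifts the text that follows it. Whether isolate_run() copes with line breaks inside a paragraph is untested here.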
This is some code I developed to answer this SO question.
You give it a character-position range in a paragraph and it does the needful to isolate that range of characters into its own single run having the same character formatting as the original. If you don't change the text of the paragraph between calls, it can be called repeatedly with different ranges to isolate multiple ranges, like multiple matches to re.Pattern.findall().
I'm not sure what will become of it, but it was more work than I originally guessed, so I want to keep it around for future reference.
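For the repeated-call case, the ranges have to come from somewhere. A small sketch, assuming the goal is one range per regex match: findall() returns only the matched strings, so this uses re.finditer() to get the offsets instead. match_ranges is a hypothetical helper name.

```python
import re


def match_ranges(text, pattern):
    """Return (start, end) character offsets for every non-empty match."""
    return [
        m.span()
        for m in re.finditer(pattern, text)
        if m.end() > m.start()  # skip zero-width matches
    ]


print(match_ranges("ABCD EFGA", r"A"))  # [(0, 1), (8, 9)]

# Hypothetical use with isolate_run() from this thread (not run here):
#
#   for start, end in match_ranges(paragraph.text, r"A"):
#       run = isolate_run(paragraph, start, end)
#       run.bold = True
```

Formatting a run does not change the paragraph text, so the offsets computed up front stay valid across the calls.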