Open scanny opened 10 years ago
I'm trying to get this working maybe you can help:
XML works just like above, but I'm not sure how to add a URL to the docs References and get the refId.
Could someone point me in the right direction?
This method in the test code might be a help: https://github.com/python-openxml/python-docx/blob/master/tests/opc/test_package.py#L328
The critical call will be something like:
rId = part.relate_to(url, reltype, is_external=True)
Then the value of rId will replace your 'rId5' above.
For the main document, you can get a reference to the document part using document._document_part.
reltype can vary, but a regular hyperlink uses docx.opc.constants.RELATIONSHIP_TYPE.HYPERLINK. I usually use this to get those references:
from docx.opc.constants import RELATIONSHIP_TYPE as RT
foo = RT.HYPERLINK
You can check an example document using opc-diag to see what relationship type URI it uses in your particular case if you think it might be different.
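For anyone without opc-diag at hand, the relationship types in an existing document can also be inspected with the standard library alone. This is only a minimal sketch: the function name list_relationships is made up for illustration, and it assumes the main document part lives at word/document.xml so that its relationships are in word/_rels/document.xml.rels.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by every .rels part in an OPC package.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Assumed location of the main document's relationships part.
RELS_PATH = "word/_rels/document.xml.rels"


def list_relationships(docx_file):
    """Return (rId, type URI, target, is_external) for each relationship.

    docx_file is a path or a binary file-like object. The function name
    is hypothetical; it is not part of python-docx or opc-diag.
    """
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read(RELS_PATH))
    return [
        (
            rel.get("Id"),
            rel.get("Type"),
            rel.get("Target"),
            rel.get("TargetMode") == "External",
        )
        for rel in root.iter("{%s}Relationship" % RELS_NS)
    ]
```

Calling list_relationships on a document saved from Word that contains a hyperlink should show which type URI and TargetMode that hyperlink uses, which is the value to pass as reltype.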
Thank you for the fast reply -- I'll try it out and get back to you soon...
I got it working, thanks so much! Could you please give me some quick feedback?
https://github.com/robertdodd/python-docx/commit/1aa3860fcbbe8675b23b06ec9144136dc8b88e24
I passed the document to paragraph.add_hyperlink so the relationship could be created. Is there a way to use the document without passing it around?
hyperlink = p.add_hyperlink(document, text='Google', url='http://google.com')
I also want to update the URL manually -- but you need a reference to the document to do that and I'm not sure where to put it.
hyperlink.text = 'Google'
hyperlink.url = 'http://google.com'
@robertdodd Any progress on this feature? I see you went back and forth and got some feedback from @scanny.
Hey @collinstocks -- If you want to use this now I got it working roughly over here. There are also internal hyperlinks over here thanks to Anton.
I got great feedback from @scanny and Anton but I've been a bit caught up recently and haven't finished implementing it yet. I should get some time soon -- and hopefully we'll see it merged!
@robertdodd: Same question as @collinstocks and a note. I'm using your implementation and it works quite well. However, the hyperlinks are generated without any styling associated. (Looks like plain text) Maybe address that before releasing this feature?
@robertdodd, @AKimZ: I continued @robertdodd's work; styles and multiple runs for the same link are possible now. See https://github.com/tanyunshi/python-docx/commit/6b9d40b8edf5959f7891f019a248514c691ae07e
@tanyunshi Thank you! This is awesome. I wonder if you'll be able to merge what you have with the most recent version of python_docx (since we can adjust the color of text to be blue).
@johnzupancic, hihi, merge done here https://github.com/tanyunshi/python-docx/commit/90237e81a61810c7272c73fbef8edeb6b8be63bd
Hi guys,
It looks like adding hyperlinks is now possible, but what about reading them from a paragraph? My problem is that the hyperlinks are missing when I read the text of a paragraph. Can I solve this using the changes made in this thread?
Hi @scanny, when will this feature be implemented in the main repo?
Is adding a hyperlink now supported?
I'm curious if this will be implemented in the main repo as well. Otherwise great work on the project and the documentation is actually really useful.
@scanny Any chance of this being merged into the main repo?
Hi @Courthold, I don't think this will be merged into the main repo as it lacks tests and the API has not been vetted. Here is the discussion: https://github.com/python-openxml/python-docx/pull/162.
I think there were some problems in the implementation (see also @gordeychuk).
For anyone needing a workaround, you can use this function. Note that it only lets you write a hyperlink; you won't be able to modify the link without going back down to the lxml level.
from docx.enum.dml import MSO_THEME_COLOR_INDEX
from docx.opc.constants import RELATIONSHIP_TYPE as RT
from docx.oxml.shared import OxmlElement, qn

def add_hyperlink(paragraph, url, text):
    """
    A function that places a hyperlink within a paragraph object.
    :param paragraph: The paragraph we are adding the hyperlink to.
    :param url: A string containing the required url
    :param text: The text displayed for the url
    :return: A Run object containing the hyperlink
    """
    # This gets access to the document.xml.rels file and gets a new relation id value
    part = paragraph.part
    r_id = part.relate_to(url, RT.HYPERLINK, is_external=True)
    # Create the w:hyperlink tag and add needed values
    hyperlink = OxmlElement('w:hyperlink')
    hyperlink.set(qn('r:id'), r_id)
    hyperlink.set(qn('w:history'), '1')
    # Create a w:r element
    new_run = OxmlElement('w:r')
    # Create a new w:rPr element
    rPr = OxmlElement('w:rPr')
    # Create a w:rStyle element. Note this currently does not add the hyperlink style as it's not in
    # the default template; I have left it here in case someone uses a template that has the style in it
    rStyle = OxmlElement('w:rStyle')
    rStyle.set(qn('w:val'), 'Hyperlink')
    # Join all the xml elements together and add the required text to the w:r element
    rPr.append(rStyle)
    new_run.append(rPr)
    new_run.text = text
    hyperlink.append(new_run)
    # Create a new Run object and add the hyperlink into it
    r = paragraph.add_run()
    r._r.append(hyperlink)
    # A workaround for the lack of a hyperlink style (doesn't go purple after using the link)
    # Delete this if using a template that has the hyperlink style in it
    r.font.color.theme_color = MSO_THEME_COLOR_INDEX.HYPERLINK
    r.font.underline = True
    return r
Great job! How can I make a hyperlink inside the file that points to another paragraph?
How can I make a hyperlink inside the file that points to another paragraph?
It would be best to unzip a Word document and figure out what's needed. Personally, to figure the above out I made documents with only the required feature in them, unzipped them, and determined the code that differed. What made it easier was putting things in a table so you get logical containers for certain parts of the code.
I would assume that you would use the above code, with the exception that the line
r_id = part.relate_to(url, RT.HYPERLINK, is_external=True)
would change to something like
r_id = part.relate_to(internal_tag, RT.HYPERLINK, is_external=False)
Then you would need to make an internal_tag for some other part of the document.
The workaround didn't work for me. I had to modify it to insert the hyperlink directly into the paragraph:
import docx

def add_hyperlink(paragraph, url, text):
    """
    A function that places a hyperlink within a paragraph object.
    :param paragraph: The paragraph we are adding the hyperlink to.
    :param url: A string containing the required url
    :param text: The text displayed for the url
    :return: The hyperlink object
    """
    # This gets access to the document.xml.rels file and gets a new relation id value
    part = paragraph.part
    r_id = part.relate_to(url, docx.opc.constants.RELATIONSHIP_TYPE.HYPERLINK, is_external=True)
    # Create the w:hyperlink tag and add needed values
    hyperlink = docx.oxml.shared.OxmlElement('w:hyperlink')
    hyperlink.set(docx.oxml.shared.qn('r:id'), r_id)
    # Create a w:r element
    new_run = docx.oxml.shared.OxmlElement('w:r')
    # Create a new w:rPr element
    rPr = docx.oxml.shared.OxmlElement('w:rPr')
    # Join all the xml elements together and add the required text to the w:r element
    new_run.append(rPr)
    new_run.text = text
    hyperlink.append(new_run)
    # Insert the hyperlink directly into the paragraph, not into a run
    paragraph._p.append(hyperlink)
    return hyperlink

document = docx.Document()
p = document.add_paragraph()
add_hyperlink(p, 'http://www.google.com', 'Google')
document.save('demo.docx')
@rushton3179 can you elaborate more? eg. how can we create the internal_tag?
@johanvandegriff Your solution works fine for me. I just haven't mastered the skills needed to change color, font etc on the returned hyperlink. Can I get the function to return a 'run' instead so I can use run.style or run.underline?
@posterberg I don't know how to make a workaround that returns a run, but I have improved the current one to take the color and underline as arguments.
Here are the steps I took to change the text color, in case you need to add other properties:
Go to the word/ folder in the unzipped archive and open document.xml.
Find <w:color w:val="FF8822"/> inside the <w:rPr> element. (Side note: rPr stands for "run Properties".)
Add code that appends the same element to rPr:
# Add color if it is given
if color is not None:
    c = docx.oxml.shared.OxmlElement('w:color')
    c.set(docx.oxml.shared.qn('w:val'), color)
    rPr.append(c)
Here is the updated workaround with control of color and underlining:
import docx

def add_hyperlink(paragraph, url, text, color, underline):
    """
    A function that places a hyperlink within a paragraph object.
    :param paragraph: The paragraph we are adding the hyperlink to.
    :param url: A string containing the required url
    :param text: The text displayed for the url
    :param color: Hex RGB string such as 'FF8822', or None to leave the color unset
    :param underline: If False, a w:u element with value 'none' is added
    :return: The hyperlink object
    """
    # This gets access to the document.xml.rels file and gets a new relation id value
    part = paragraph.part
    r_id = part.relate_to(url, docx.opc.constants.RELATIONSHIP_TYPE.HYPERLINK, is_external=True)
    # Create the w:hyperlink tag and add needed values
    hyperlink = docx.oxml.shared.OxmlElement('w:hyperlink')
    hyperlink.set(docx.oxml.shared.qn('r:id'), r_id)
    # Create a w:r element
    new_run = docx.oxml.shared.OxmlElement('w:r')
    # Create a new w:rPr element
    rPr = docx.oxml.shared.OxmlElement('w:rPr')
    # Add color if it is given
    if color is not None:
        c = docx.oxml.shared.OxmlElement('w:color')
        c.set(docx.oxml.shared.qn('w:val'), color)
        rPr.append(c)
    # Remove underlining if it is requested
    if not underline:
        u = docx.oxml.shared.OxmlElement('w:u')
        u.set(docx.oxml.shared.qn('w:val'), 'none')
        rPr.append(u)
    # Join all the xml elements together and add the required text to the w:r element
    new_run.append(rPr)
    new_run.text = text
    hyperlink.append(new_run)
    paragraph._p.append(hyperlink)
    return hyperlink

document = docx.Document()
p = document.add_paragraph()
# Add a hyperlink with the normal formatting (blue underline)
hyperlink = add_hyperlink(p, 'http://www.google.com', 'Google', None, True)
# Add a hyperlink with a custom color and no underline
hyperlink = add_hyperlink(p, 'http://www.google.com', 'Google', 'FF8822', False)
document.save('demo.docx')
This function is the hyperlink equivalent of duct tape: it gets the job done, but becomes harder to use as the complexity of the task increases.
Nice job @johanvandegriff :)
Just a note for anyone who doesn't know about it, opc-diag can be very handy for poking around inside .docx packages as an alternative to unzipping and reformatting the XML yourself. Also works for .xlsx and .pptx files.
@johanvandegriff Thank you so much!
How can I make the "inline_shape" as the hyperlink? Basically, I want an image as a hyperlink.
@scanny is anyone working on this? I was potentially going to pick it up this weekend and have a look at implementing it.
There was this pull request a while back but it stalled pretty early on: https://github.com/python-openxml/python-docx/pull/278
The comments on that PR should be good guidance. Best to start with the enhancement proposal (analysis document) so we can be sure we have the API sorted out. Can't change our mind about that later so it's best to get it right up-front. And implementing the wrong API isn't terrifically productive :)
I should add that most folks get stuck on the tests. If you're already a TDD guy these shouldn't be too surprising, but in any case you can usually find and adapt an existing example for both the acceptance tests and the unit tests. There's a lot of "repeating theme" going on in this particular application domain :)
If I create hyperlinks using the above workaround, what would the code to extract the URLs look like?
And indeed, having created a link like this
hyperlink = add_hyperlink(p, 'http://www.google.com', 'Google', '0000FF', True)
print(p.text)
You can no longer access the paragraph text. Is there a workaround?
see #85
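Until reading hyperlinks is supported, one possible workaround is to read the package directly with the standard library. The sketch below is an assumption-laden illustration, not python-docx API: read_paragraphs is a made-up name, it assumes the main part is word/document.xml with relationships in word/_rels/document.xml.rels, and it ignores nested paragraphs such as those in text boxes. It returns the paragraph text including the text inside w:hyperlink, plus each link's display text and target.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_RELS = "http://schemas.openxmlformats.org/package/2006/relationships"


def read_paragraphs(docx_file):
    """Yield (paragraph_text, links) for each w:p in the main document.

    links is a list of (link_text, target) pairs. target is the
    relationship target for r:id links, "#name" for w:anchor links,
    and None if the hyperlink carries neither attribute.
    """
    with zipfile.ZipFile(docx_file) as package:
        document = ET.fromstring(package.read("word/document.xml"))
        rels = ET.fromstring(package.read("word/_rels/document.xml.rels"))
    # Map relationship ids (rId5, ...) to their targets.
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.iter("{%s}Relationship" % PKG_RELS)
    }
    for p in document.iter("{%s}p" % W):
        # Every w:t under the paragraph, in document order, including
        # the ones nested inside w:hyperlink.
        text = "".join(t.text or "" for t in p.iter("{%s}t" % W))
        links = []
        for h in p.iter("{%s}hyperlink" % W):
            link_text = "".join(t.text or "" for t in h.iter("{%s}t" % W))
            r_id = h.get("{%s}id" % R)
            anchor = h.get("{%s}anchor" % W)
            if r_id is not None:
                target = targets.get(r_id)
            elif anchor is not None:
                target = "#" + anchor
            else:
                target = None
            links.append((link_text, target))
        yield text, links
```

A call such as list(read_paragraphs('demo.docx')) would then give one entry per paragraph, with the link text included in the paragraph text.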
@scanny would you be able to have a look over testing so far so I know I am on the right track? https://github.com/rushton3179/python-docx/tree/feature/hyperlink-tdd
Also I have been thinking and do we need methods to be able to collect hyperlinks from a paragraph? My instinct would be to leave it as a more simple class as it begins to bring doubt on how collecting runs from a paragraph may work. I feel it may be better to leave them and if the original Hyperlink objects are not held onto by the project then they are considered lost to be created again. The core functionality of the Paragraph class seems to be to create paragraphs and not to parse and process them.
@rushton3179 It's probably best to continue this as a pull request (PR). That way there's a segregated "space" where all the proposed changes are, along with any review conversation we might have. As you rebase and re-push the PR branch on your repo, it updates things in the PR. If you haven't done it before you might want to read up.
Anyway, it's pretty flexible, so it's not too early to get one going.
I've left you some comments on your branch, but let's continue from here in a PR :)
@johanvandegriff I'm able to get the color (using '0000EE' as the default blue hyperlink color) using your workaround but not the underlining. Interestingly enough, when I open my document in WordPad I get the color and the underlining, but in Word 2016 I only get the former. Have you come across this at all? (I'm currently using a Word macro as an alternative.)
@dmitriy5 I have been using LibreOffice, so I don't know if it works in Word. You might want to add the underlining in Word, save it, and see how the xml has changed.
Underline was not working for me in Word either using @johanvandegriff's code. To have it underline by default, you need to add:
u = docx.oxml.shared.OxmlElement('w:u')
u.set(docx.oxml.shared.qn('w:val'), 'single')
rPr.append(u)
before you run new_run.append(rPr). You can also change 'single' to 'double' to double underline.
Hello, the @rushton3179 solution works like a charm for a new paragraph. But is it possible to use it to insert a hyperlink into a table in a docx document?
Something like this: row.cells[1].paragraphs[0].text = add_hyperlink(...), because I'm getting a lot of errors when I do this in a table.
Best regards
Hello, I looked all over Google for a way to add hyperlinks to my .docx files using python-docx. The only working solution I found was the code sample in this topic, but as @Naff16 said, it does not work in tables:
hdr_cells = table.rows[1].cells
p = hdr_cells[1].paragraphs[0]
add_hyperlink(p, ....)
The resulting .docx file is corrupted and can be restored, but without the hyperlink. Any ideas? I need to add hyperlinks to other files on the PC (the .docx would be too heavy if I added all the pictures directly, so I prefer to store them in another folder and just link to them). I know that adding hyperlinks to the document works, but I absolutely need to insert those hyperlinks in a table. Thanks
Hello! After several tests, it appears that the "does not work in tables" problem was, in my case, caused by the table itself: the table I tried to insert the hyperlink into was copied from another .docx file using deepcopy(), and somehow this is a problem. So the solution I found is:
1) Insert the table in the document:
# cartouche is a table deepcopied from another .docx
p = self.document.add_paragraph()
p._p.addnext(cartouche._tbl)
2) Then insert the link in the table, which is the last table inserted in the document:
for f in element.files:
    p_table = self.document.tables[-1].rows[2].cells[1].add_paragraph()
    file_name = f.split('/')[-1]
    file_path = 'EVIDENCES/{}'.format(file_name)
    # add the link
    add_hyperlink(p_table, file_name, file_path)
This worked just fine for me.
Following on from @Adviser-ua's comment:
How can I make a hyperlink inside the file that points to another paragraph?
For anyone in this situation, i.e. wanting to link to an internal bookmark, this function, based on a stripped-down version of @johanvandegriff's code above, worked for me (in Word 2010):
import docx

def add_hyperlink(paragraph, link_to, text, is_external):
    ''' Adds a hyperlink within a paragraph to an internal bookmark
    or an external url '''
    part = paragraph.part
    hyperlink = docx.oxml.shared.OxmlElement('w:hyperlink')
    if is_external:
        # External link: register a relationship and reference it by id
        r_id = part.relate_to(link_to,
                              docx.opc.constants.RELATIONSHIP_TYPE.HYPERLINK,
                              is_external=is_external)
        hyperlink.set(docx.oxml.shared.qn('r:id'), r_id)
    else:
        # Internal link: reference the bookmark name directly
        hyperlink.set(docx.oxml.shared.qn('w:anchor'), link_to)
    new_run = docx.oxml.shared.OxmlElement('w:r')
    rPr = docx.oxml.shared.OxmlElement('w:rPr')
    new_run.append(rPr)
    new_run.text = text
    hyperlink.append(new_run)
    paragraph._p.append(hyperlink)
Set is_external to False and pass a bookmark name to link_to.
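To make the difference between the two kinds of link concrete, here is a rough standard-library sketch of the XML each one produces. It only serializes a fragment; it does not touch a document. The helper name hyperlink_xml and the sample values 'rId5' and 'intro' are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
# Use the conventional w: and r: prefixes when serializing.
ET.register_namespace("w", W)
ET.register_namespace("r", R)


def hyperlink_xml(text, r_id=None, anchor=None):
    """Serialize a w:hyperlink wrapping one run (hypothetical helper).

    Pass exactly one of r_id (external link, resolved through
    document.xml.rels) or anchor (internal link to a bookmark name).
    """
    if (r_id is None) == (anchor is None):
        raise ValueError("pass exactly one of r_id or anchor")
    hyperlink = ET.Element("{%s}hyperlink" % W)
    if r_id is not None:
        hyperlink.set("{%s}id" % R, r_id)
    else:
        hyperlink.set("{%s}anchor" % W, anchor)
    run = ET.SubElement(hyperlink, "{%s}r" % W)
    ET.SubElement(run, "{%s}t" % W).text = text
    return ET.tostring(hyperlink, encoding="unicode")


print(hyperlink_xml("Google", r_id="rId5"))
print(hyperlink_xml("Introduction", anchor="intro"))
```

The external form carries r:id and depends on a relationship entry; the internal form carries w:anchor and needs no relationship at all, which is why relate_to is skipped in that branch.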
If you need to make a bookmark:
def add_bookmark(run, bookmark_name):
    ''' Adds a word bookmark to a run '''
    tag = run._r
    start = docx.oxml.shared.OxmlElement('w:bookmarkStart')
    start.set(docx.oxml.ns.qn('w:id'), '0')
    start.set(docx.oxml.ns.qn('w:name'), bookmark_name)
    tag.append(start)
    text = docx.oxml.OxmlElement('w:r')
    tag.append(text)
    end = docx.oxml.shared.OxmlElement('w:bookmarkEnd')
    end.set(docx.oxml.ns.qn('w:id'), '0')
    end.set(docx.oxml.ns.qn('w:name'), bookmark_name)
    tag.append(end)
    return run
One thing to note is that if the bookmark contains a space it causes a problem if the .docx is exported to PDF, i.e. it won't link in the exported PDF.
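One way to sidestep the space problem is to normalize the name before using it for both the bookmark and the link. The helper below is a sketch with a made-up name (safe_bookmark_name). It assumes the rules Word's own bookmark dialog enforces, as far as I know: start with a letter, use only letters, digits, and underscores, and stay within 40 characters.

```python
import re


def safe_bookmark_name(label, max_len=40):
    """Turn an arbitrary label into a conservative bookmark name.

    Hypothetical helper: the rules applied here (letters, digits and
    underscores only, leading letter, at most 40 characters) are an
    assumption based on what Word accepts in its bookmark dialog.
    """
    # Collapse every run of non-word characters (spaces, punctuation)
    # into a single underscore, then trim underscores at the ends.
    name = re.sub(r"\W+", "_", label.strip()).strip("_")
    # Make sure the name starts with a letter.
    if not name or not name[0].isalpha():
        name = "bm_" + name
    return name[:max_len]
```

The same sanitized string has to be passed to add_bookmark and to link_to, otherwise the link will point at a name that does not exist.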
It bothered me that the text is not written into a normal run, but into an Element, so that font size and color are not preserved. I finally came up with this solution, that just adds a hyperlink to a normal run. The run parameter must of course be one of the runs of the paragraph. I confess that I have only a vague idea how lxml and docx work together. In the moment when hyperlink.append(run._r) is called, the run disappears from the runs, but the hyperlink is then inserted into runs where the run originally was.
def add_hyperlink_into_run(paragraph, run, url):
    # Find the position of the run within the paragraph (matched by text)
    runs = paragraph.runs
    for i in range(len(runs)):
        if runs[i].text == run.text:
            break
    # This gets access to the document.xml.rels file and gets a new relation id value
    part = paragraph.part
    r_id = part.relate_to(url, docx.opc.constants.RELATIONSHIP_TYPE.HYPERLINK, is_external=True)
    # Create the w:hyperlink tag and add needed values
    hyperlink = docx.oxml.shared.OxmlElement('w:hyperlink')
    hyperlink.set(docx.oxml.shared.qn('r:id'), r_id)
    # Move the existing run inside the hyperlink, then put the hyperlink where the run was
    hyperlink.append(run._r)
    paragraph._p.insert(i + 1, hyperlink)
Hi! How to add a hyperlink to an internal heading paragraph?
It works for me (LibreOffice), with a few changes. Thanks @neilbilly!
def add_bookmark(run, bookmark_name):
    ''' Adds a word bookmark to a run '''
    tag = run._r
    start = docx.oxml.shared.OxmlElement('w:bookmarkStart')
    start.set(docx.oxml.ns.qn('w:id'), '0')
    start.set(docx.oxml.ns.qn('w:name'), bookmark_name)
    tag.addprevious(start)
    text = docx.oxml.OxmlElement('w:r')
    tag.append(text)
    end = docx.oxml.shared.OxmlElement('w:bookmarkEnd')
    end.set(docx.oxml.ns.qn('w:id'), '0')
    tag.addnext(end)
    return run
A common case: I have text like """I am trying to add a hyperlink in a MS Word document using docx module for <a href="python.org">Python</a>. Just do it.""", where the keyword is "Python" and the link is "python.org". I just added a function based on @johanvandegriff's:
import re

def is_text_link(text):
    # True if the text contains any URL-like marker
    for i in ['http', '://', 'www.', '.com', '.org', '.cn', '.xyz', '.htm']:
        if i in text:
            return True
    return False

def add_text_link(document, text):
    paragraph = document.add_paragraph()
    # Split into plain text, URL, and link text segments
    text = re.split(r'<a href="|">|</a>', text)
    keyword = None
    for i in range(len(text)):
        if not is_text_link(text[i]):
            # Skip the link text, which was already written as a hyperlink
            if text[i] != keyword:
                paragraph.add_run(text[i])
        elif i + 1 < len(text):
            url = text[i]
            keyword = text[i + 1]
            add_hyperlink(paragraph, url, keyword, None, True)
    document.save('test.docx')
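A variation that does not depend on guessing which segment looks like a URL is to let the regular expression capture the href and the link text together. This is only a sketch under the same assumption as above, namely that links appear exactly as <a href="...">...</a>; split_links is a made-up name, and it is not an HTML parser.

```python
import re

# Captures the href value and the link text of a simple anchor tag.
ANCHOR = re.compile(r'<a href="([^"]*)">(.*?)</a>')


def split_links(text):
    """Split text into (segment, url) pairs; url is None for plain text."""
    parts = []
    pos = 0
    for match in ANCHOR.finditer(text):
        # Plain text between the previous match and this anchor.
        if match.start() > pos:
            parts.append((text[pos:match.start()], None))
        parts.append((match.group(2), match.group(1)))
        pos = match.end()
    # Trailing plain text after the last anchor.
    if pos < len(text):
        parts.append((text[pos:], None))
    return parts
```

Each pair then maps onto either paragraph.add_run(segment) when url is None, or add_hyperlink(paragraph, url, segment, None, True) otherwise.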
p_table = self.document.tables[-1].rows[2].cells[1].add_paragraph()
Thank you for this!
This is the bit I needed to properly reference the paragraph so I could insert a hyperlink in a cell. All working now.
We also got it running with @johanvandegriff 's solution, thanks! Once the feature is shipped then we'll move to the official solution :) thanks guys
I thank you all for the works trying to improve this wonderful project. Sorry I have not a real understanding of how all these implementations work, but please take this comment into account before merging code into an official solution.
I tried many of these code samples trying to add links to a document, using both examples given here and in StackOverflow.
Although many of them worked, in the sense that hyperlinks do appear when I open the docx file in Word, ... there is still something which must be different to the standard .docx way of hyperlinking.
I say this because when I upload these docx files to Google Drive in order to share them, the hyperlinks get lost after conversion to Google Docs format (which I do because this format does not consume my Drive quota). This does NOT happen to hyperlinks in a "standard" .docx file created with MS Word (they remain when you convert the file to Google Docs format).
This might seem irrelevant to many of you, but I think it reveals some error in the way hyperlinks are being created. It could affect future conversions or compatibility of your files (I only tried Google Docs, but there might be other conversions which are already failing).
Fortunately, I found one implementation in this thread (thanks @michaelu123) where hyperlinks are not being lost. There is a similar implementation by @brasky in #610 too.
So @johanvandegriff @scanny @tanyunshi @robertdodd @ryan-rushton ... please take a look at @michaelu123 code before making a final version.
Thanks a lot again to all of you!!
I'm trying to add a hyperlink to my table inside of one of the cells, but when I use this method it messes up the spacing of the column. The hyperlink isn't wrapped around in the cell like I want it to be.
Edit: Oh wait nevermind, my table had "automatically resize to fit contents". I wasn't having this issue until I added the hyperlink weirdly enough.. to fix it you add table.autofit = False
How can I add a file logo in place of the hyperlink?
Protocol might be something like this:
XML specimen: