dernorberto / confluenceDumpWithPython

Download Confluence pages including attachments and emoticons using Atlassian API and Python
MIT License
14 stars 6 forks

Non-image attachments are downloaded but the link is not updated #4

Closed dernorberto closed 1 year ago

dernorberto commented 1 year ago

example pageID: 716701697

HTML: <a href="https://optile.atlassian.net/wiki/download/attachments/716701697/File.docx?version=1&amp;modificationDate=1540461757616&amp;cacheVersion=1&amp;api=v2" rel="nofollow">

translates to RST: -File.docx https://optile.atlassian.net/wiki/download/attachments/716701697/File.docx?version=1&modificationDate=1540461757616&cacheVersion=1&api=v2__ (attached)

cause:

dernorberto commented 1 year ago

Fixed by changing the dumpHtml function: soup.findAll('img',class_="confluence-embedded-image") became soup.findAll('img',class_=re.compile("^confluence-embedded-image")). The search now includes all embeds whose class starts with that string.
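
For anyone reading later, here is a rough sketch of what the change does. It is not the code from dumpHtml. It uses only the standard library (html.parser and re) as a stand-in for the BeautifulSoup call, so it runs without extra packages. As far as I understand, BeautifulSoup tests a plain string against each class name on the tag and applies a compiled regex with search(), and the sketch mimics that. The ImgCollector class, the collect helper, the sample markup and the class name "confluence-embedded-image-thumbnail" are all made up for illustration.

```python
import re
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the src of every <img> with at least one accepted class name."""

    def __init__(self, accepts):
        super().__init__()
        self.accepts = accepts  # callable: class name -> truthy if it matches
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # A class attribute can hold several space-separated names.
        class_names = (attr.get("class") or "").split()
        if any(self.accepts(name) for name in class_names):
            self.sources.append(attr.get("src"))


def collect(html, accepts):
    """Return the src values of all <img> tags accepted by the filter."""
    parser = ImgCollector(accepts)
    parser.feed(html)
    parser.close()
    return parser.sources


# Hypothetical markup, not taken from a real Confluence page.
SAMPLE = (
    '<img class="confluence-embedded-image" src="diagram.png">'
    '<img class="confluence-embedded-image-thumbnail" src="thumb.png">'
    '<img class="emoticon" src="smile.png">'
)

# Old behaviour: the class name has to equal the string exactly.
exact = collect(SAMPLE, lambda name: name == "confluence-embedded-image")

# New behaviour: any class name that starts with the prefix is accepted.
prefix = collect(SAMPLE, re.compile(r"^confluence-embedded-image").search)

print(exact)
print(prefix)
```

The exact filter skips the second tag because its class name only begins with the string, while the anchored regex accepts both embed variants and still ignores unrelated images such as emoticons.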