atlassian-api / atlassian-python-api

Atlassian Python REST API wrapper
https://atlassian-python-api.readthedocs.io
Apache License 2.0

Reading large confluence page #1416

Open hemaswapnika1 opened 5 months ago

hemaswapnika1 commented 5 months ago

confluence.get_page_by_id(page_id, expand='body.storage') is not returning all of the data when the Confluence page is very large (~1500 KB). How can reading such large Confluence pages be handled?

gonchik commented 2 months ago

Hi @hemaswapnika1, just extend the timeout when initializing the client; that should be enough.
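
For example, a minimal sketch of raising the timeout (in seconds) when constructing the client; the URL, credentials, and page ID below are placeholders:

from atlassian import Confluence

# Placeholder URL and credentials; the library's default timeout is 75 seconds.
confluence = Confluence(
    url="https://confluence.example.com/",
    username="user",
    password="api-token",
    timeout=300,  # allow more time for very large pages
)
page = confluence.get_page_by_id("123456", expand="body.storage")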

eoinmarron commented 1 month ago

I've bumped the timeout up from the default of 75 (seconds, presumably...) in version 3.41.16 to 150, and I'm still not getting any more of the page returned in my case.

I've also noticed the same behaviour when calling the Confluence REST API directly, which leads me to think this isn't necessarily a problem with atlassian-python-api.
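
One way to compare against the raw REST API is to fetch the same content with requests; a minimal sketch, where the base URL, credentials, and page ID are placeholders:

import requests

# Placeholder base URL, credentials, and page ID; timeout in seconds.
resp = requests.get(
    "https://confluence.example.com/rest/api/content/123456",
    params={"expand": "body.storage"},
    auth=("user", "api-token"),
    timeout=300,
)
resp.raise_for_status()
storage_body = resp.json()["body"]["storage"]["value"]
print(len(storage_body))  # compare with the length returned by the wrapper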

eoinmarron commented 1 month ago

On further exploration with the .get_page_as_pdf(page_id) method, I was able to see the whole page content written to the PDF. I then tried writing the .get_page_by_id(page_id) output to a file, and that also worked (the whole page content was in the file). This points to PyCharm truncating the value when rendering it in debugger mode, rather than the API returning incomplete data.

System: Python 3.10.4, PyCharm 2024.2.1, atlassian-python-api 3.41.16

working code for me:

from atlassian import Confluence

# Connect to the Confluence instance (credentials redacted).
conf = Confluence(
    url="https://confluence.foobar.com/",
    username="svc-crops-okta-confluence",
    token="foobar",
)
# Look up the page ID by space key and page title.
page_id = conf.get_page_id(
    "foobar",
    "foobar_page"
)
# Fetch the page with the rendered HTML body and write it to a file.
page_content = conf.get_page_by_id(page_id, expand="body.view")
html_body = page_content["body"]["view"]["value"]
with open("test_file.txt", "w") as stream:
    stream.write(html_body)
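
When checking the result, printing len(html_body) or opening the written file is more reliable than inspecting the variable in the PyCharm debugger, since the debugger view may truncate very long strings.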