Open ricardoxu6 opened 1 month ago
Hello @ricardoxu6! I'm here to help you with any bugs, questions, or contributions you may have. Let's tackle this ConfluenceReader base url issue together.
Yes, you can use ConfluenceReader with a base URL that is hosted on a custom domain. The base_url parameter in the ConfluenceReader class is designed to accept any valid Confluence instance URL, including those hosted on custom domains.
Here is the relevant part of the code that shows how the base_url is used:
class ConfluenceReader(BaseReader):
    """Confluence reader.

    Reads a set of confluence pages given a space key and optionally a list of page ids

    Args:
        oauth2 (dict): Atlassian OAuth 2.0, minimum fields are `client_id` and `token`, where `token` is a dict and must at least contain "access_token" and "token_type".
        base_url (str): 'base_url' for confluence cloud instance, this is suffixed with '/wiki', eg 'https://yoursite.atlassian.com/wiki'
        cloud (bool): connecting to Confluence Cloud or self-hosted instance
    """

    def __init__(
        self, base_url: str = None, oauth2: Optional[Dict] = None, cloud: bool = True
    ) -> None:
        if base_url is None:
            raise ValueError("Must provide `base_url`")
        self.base_url = base_url
        try:
            from atlassian import Confluence
        except ImportError:
            raise ImportError(
                "`atlassian` package not found, please run `pip install"
                " atlassian-python-api`"
            )
        self.confluence: Confluence = None
        if oauth2:
            self.confluence = Confluence(url=base_url, oauth2=oauth2, cloud=cloud)
        else:
            api_token = os.getenv(CONFLUENCE_API_TOKEN)
            if api_token is not None:
                self.confluence = Confluence(url=base_url, token=api_token, cloud=cloud)
            else:
                user_name = os.getenv(CONFLUENCE_USERNAME)
                if user_name is None:
                    raise ValueError(
                        "Must set environment variable `CONFLUENCE_USERNAME` if oauth,"
                        " oauth2, or `CONFLUENCE_API_TOKEN` are not provided."
                    )
                password = os.getenv(CONFLUENCE_PASSWORD)
                if password is None:
                    raise ValueError(
                        "Must set environment variable `CONFLUENCE_PASSWORD` if oauth,"
                        " oauth2, or `CONFLUENCE_API_TOKEN` are not provided."
                    )
                self.confluence = Confluence(
                    url=base_url, username=user_name, password=password, cloud=cloud
                )
As long as you provide a valid base_url for your custom domain, the ConfluenceReader should work correctly [1].
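To make the fallback order in the constructor excerpt easier to follow, here is a minimal, standard-library-only sketch of how the arguments passed to Confluence appear to be chosen. The helper name confluence_kwargs and the env parameter are hypothetical and not part of the library; the sketch also assumes the module constants CONFLUENCE_API_TOKEN, CONFLUENCE_USERNAME, and CONFLUENCE_PASSWORD hold environment variable names of the same spelling, as the error messages suggest.

```python
from typing import Dict, Mapping, Optional


def confluence_kwargs(
    base_url: str,
    env: Mapping[str, str],
    oauth2: Optional[Dict] = None,
    cloud: bool = True,
) -> Dict:
    """Hypothetical helper mirroring the auth fallback order in the excerpt.

    Order: oauth2 dict, then API token, then username/password.
    """
    if oauth2:
        return {"url": base_url, "oauth2": oauth2, "cloud": cloud}

    # Assumption: the env var names match the constant names in the excerpt.
    api_token = env.get("CONFLUENCE_API_TOKEN")
    if api_token is not None:
        return {"url": base_url, "token": api_token, "cloud": cloud}

    user_name = env.get("CONFLUENCE_USERNAME")
    password = env.get("CONFLUENCE_PASSWORD")
    if user_name is None or password is None:
        raise ValueError(
            "Set CONFLUENCE_USERNAME and CONFLUENCE_PASSWORD, "
            "or CONFLUENCE_API_TOKEN, or pass oauth2."
        )
    return {
        "url": base_url,
        "username": user_name,
        "password": password,
        "cloud": cloud,
    }


# Custom-domain example: the URL is passed through unchanged, and cloud=False
# marks the instance as self-hosted (per the docstring's `cloud` argument).
kwargs = confluence_kwargs(
    "https://confluence.xxx.com",
    env={"CONFLUENCE_API_TOKEN": "example-token"},
    cloud=False,
)
```

With the real reader, the equivalent would presumably be to export the token in the environment and call ConfluenceReader(base_url="https://confluence.xxx.com", cloud=False); whether cloud=False is right depends on whether the custom domain fronts a self-hosted instance.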
To continue the conversation, mention @dosu.
What if the Confluence page URLs on my custom domain do not include a page_id? Can I still use the ConfluenceReader?
Yes, you can use the ConfluenceReader class to read Confluence pages on a custom domain even if the URLs do not include a page_id. The load_data method allows you to specify Confluence pages using one of four mutually exclusive methods: space_key, page_ids, label, or cql (Confluence Query Language). Therefore, you can use space_key, label, or cql to read pages without needing to specify page_ids [1].
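As a rough illustration of the "one of four mutually exclusive methods" rule, the sketch below checks that exactly one selector is given before forwarding it to load_data. The helpers one_selector and load_pages are hypothetical wrappers, not library functions, and the space key, label, and CQL string are placeholder values; only the four keyword names come from the answer above.

```python
from typing import Any, Dict, List, Optional


def one_selector(
    space_key: Optional[str] = None,
    page_ids: Optional[List[str]] = None,
    label: Optional[str] = None,
    cql: Optional[str] = None,
) -> Dict[str, Any]:
    """Return the single selector that was provided, or raise ValueError."""
    candidates = {
        "space_key": space_key,
        "page_ids": page_ids,
        "label": label,
        "cql": cql,
    }
    chosen = {name: value for name, value in candidates.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError(
            "Provide exactly one of space_key, page_ids, label, or cql; "
            f"got {sorted(chosen)}"
        )
    return chosen


def load_pages(reader: Any, **selector: Any) -> list:
    """Forward the validated selector to reader.load_data.

    `reader` is expected to be a ConfluenceReader instance (or anything
    exposing a compatible load_data method).
    """
    return reader.load_data(**one_selector(**selector))


# None of these require knowing a page_id (placeholder values):
by_space = one_selector(space_key="DOCS")
by_label = one_selector(label="public")
by_cql = one_selector(cql='space = "DOCS" and type = page')
```

Given a reader built for the custom domain, load_pages(reader, space_key="DOCS") would then fetch the pages in that space without any page_id appearing in the call.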
Question
token = {"access_token": "", "token_type": ""}
oauth2_dict = {"client_id": "", "token": token}
base_url = "https://yoursite.atlassian.com/wiki"
To read Confluence pages with ConfluenceReader, we have to specify a base URL, and that URL is expected to be in the typical Atlassian Cloud format, which ends with /wiki. If my Confluence is hosted on a custom domain, for instance https://confluence.xxx.com/, is there another way to use ConfluenceReader?