Unable to test upload of Antora component

cpkio commented 7 months ago

I launched a local Confluence test server.

I get a lot of errors pointing to links with anchors:

…
error:    Error reading file …\AntoraSampleRepo\build\site\Service\index.html#key
…

And in the end:

[07:36:24.671] FATAL (antora): Cannot read properties of undefined (reading 'number')
    Cause: TypeError
        at createState (…\AntoraSampleRepo\node_modules\antora-confluence\dist\lib\service\StateService.js:52:41)
        at processTicksAndRejections (node:internal/process/task_queues:96:5)
        at async publishToConfluence (…\AntoraSampleRepo\node_modules\antora-confluence\dist\index.js:46:9)
        at async Promise.all (index 1)
        at async generateSite (…\AntoraSampleRepo\node_modules\@antora\site-generator\lib\generate-site.js:53:12)
        at async Command.parseAsync (…\AntoraSampleRepo\node_modules\commander\lib\command.js:935:5)

PacoVK commented 7 months ago

Thanks for reporting. The last error is related to the Confluence state, which is a page in Draft mode. @cpkio is it possible to provide your sample as input, including the container definition? That would make investigation easier. So far I have developed against a Confluence Cloud instance.

cpkio commented 7 months ago

@PacoVK See https://github.com/cpkio/antora-demo-component and its /tests folder, which contains the JSON response of GET https://<server>/rest/api/content. I hope it helps with debugging.

Playbook:

output:
  clean: true
  destinations:
  - provider: fs
  - provider: antora-confluence
    confluence-api: https://192.168.56.102:8443
    confluence-space: MY

What do you mean by «container definition»?

cpkio commented 7 months ago

this.fetch(${this.BASE_URL}/${this.API_V1_PATH}/content) in ConfluenceClientV1.js:52 results in https://192.168.56.102:8443//rest/api/content if API_DEFAULT_CONTEXT = "". And there is no /wiki in a fresh Confluence installation, so it has to be "". Once the URL is fixed, pages are posted successfully.
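One way to avoid the double slash is to join the URL parts with a helper that drops empty segments. The sketch below is only an illustration: joinUrl is a hypothetical function, not part of antora-confluence, and the segment values are assumed from the URLs quoted above.

```typescript
// Hypothetical helper (not antora-confluence code): joins URL segments and
// skips empty ones, so an empty context does not produce "//" in the path.
function joinUrl(base: string, ...segments: string[]): string {
  const cleaned = segments
    .map((s) => s.replace(/^\/+|\/+$/g, "")) // trim leading and trailing slashes
    .filter((s) => s.length > 0); // drop empty segments such as a "" context
  return [base.replace(/\/+$/, ""), ...cleaned].join("/");
}

// Server without a context: the empty segment is skipped.
const withoutContext = joinUrl("https://192.168.56.102:8443", "", "rest/api", "content");

// Server with a "wiki" context (assumed example host).
const withContext = joinUrl("https://confluence.example.com", "wiki", "rest/api", "content");

console.log(withoutContext);
console.log(withContext);
```

With this approach the same call works whether or not a context is configured.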

I think the H1 should be stripped from posted pages, otherwise we get two identical headings on a page.

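CDATA processing seems to have a bug. The original <p> (Russian sample text, roughly "You can create an additional domain with a given name in the project"):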

<p>Можно создать в <a href="#RandomService:ROOT:page$index.adoc#project" class="xref unresolved">проекте</a>
дополнительный <a href="#RandomService:ROOT:page$index.adoc#domain" class="xref unresolved">домен</a>
с заданным именем.</p>

results in

<ac:plain-text-link-body><![CDATA[ домен]]></ac:plain-text-link-body></ac:link>с заданным именем.
                                  ^                                            ^

when it should be

<ac:plain-text-link-body><![CDATA[домен]]></ac:plain-text-link-body></ac:link> с заданным именем.
                                  ^                                           ^
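A minimal sketch of one way to get the expected spacing, assuming the converter walks a flat list of text and link nodes. The InlineNode type and renderInline function are hypothetical and are not taken from antora-confluence; the link target element and any escaping are left out. The idea is to collapse whitespace runs in text nodes to a single space instead of trimming them, and to never move whitespace into the link body.

```typescript
// Hypothetical node model, for illustration only.
type InlineNode =
  | { kind: "text"; value: string }
  | { kind: "link"; text: string };

function renderInline(nodes: InlineNode[]): string {
  return nodes
    .map((node) => {
      if (node.kind === "text") {
        // Collapse newlines and repeated spaces to one space, but do not trim:
        // the space next to a link must stay outside the link element.
        return node.value.replace(/\s+/g, " ");
      }
      // Link text goes into CDATA unchanged (link target omitted in this sketch).
      return `<ac:link><ac:plain-text-link-body><![CDATA[${node.text}]]></ac:plain-text-link-body></ac:link>`;
    })
    .join("");
}

const rendered = renderInline([
  { kind: "text", value: "дополнительный " },
  { kind: "link", text: "домен" },
  { kind: "text", value: "\nс заданным именем." },
]);
console.log(rendered);
```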

PacoVK commented 7 months ago

Thanks again for the investigation, I really appreciate it! There are a couple of things now:

1. The API URL: Confluence Cloud vs. Confluence on-prem. The latter has the option to set a context, but it is not required. The Cloud version always has a context. Confluence on-prem seems to be phasing out Confluence Server. There also seems to be an equivalent, but I have not yet been able to look into it. As a way forward, I think I will update the docs to help users avoid the issue described here. If you are running Confluence on-prem without a context, you could set confluenceApi to https://confluence.example.com/rest/api (the full API base URL). Captain is then able to unclutter the paths and detects that no context is required.
2. Stripping <h1>: I think that is a good idea, but it has drawbacks. Confluence page titles need to be unique across a space. If you have several pages with the same h1 in an Antora setup (which is valid), you will get a conflict in Confluence. In that case Captain detects the conflict and adds prefixes to the page titles. To preserve the original title, I have not yet simply stripped the h1 from the content. WDYT, does that make sense to you? I am open to input.
3. The CDATA processing: true, this is a bug. I will take care of it and patch it soon :)

cpkio commented 7 months ago

I was just about to ask how you solve this problem with Confluence when there are multiple pages with the same title. There can be, for example, 6 same-titled pages; will there be 5 _ characters before one of the titles? I think extra characters should be appended to the title, not prepended. Maybe some rare Unicode spaces?
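To illustrate the "append, not prepend" idea, here is a rough sketch that makes titles unique by appending a counter. The uniqueTitles function and the " (n)" suffix format are assumptions for the example only; this is not the scheme Captain uses.

```typescript
// Hypothetical helper: returns titles in the same order, with a counter
// appended to every repeat so that all titles are unique within the list.
function uniqueTitles(titles: string[]): string[] {
  const used = new Set<string>();
  return titles.map((title) => {
    let candidate = title;
    let n = 2;
    // Keep incrementing until the candidate collides with nothing seen so far.
    while (used.has(candidate)) {
      candidate = `${title} (${n})`;
      n += 1;
    }
    used.add(candidate);
    return candidate;
  });
}

console.log(uniqueTitles(["Overview", "Overview", "Setup", "Overview"]));
// → [ 'Overview', 'Overview (2)', 'Setup', 'Overview (3)' ]
```

Because the suffix comes last, the titles still sort and read by their original name.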

Deleting the H1 after parsing the titles is OK. Duplicate headings just confuse users. Ideally, readers should not be able to tell that the content was uploaded from Antora; it should look like a normal page.
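A small sketch of "parse the title first, then drop the H1", assuming the page arrives as an HTML string. The splitTitle function is hypothetical, and the regular expression only handles a simple, non-nested first <h1>; a real converter would more likely work on a parsed DOM.

```typescript
// Hypothetical helper: extracts the text of the first <h1> as the page title
// and returns the HTML body with that heading removed.
function splitTitle(html: string): { title: string | null; body: string } {
  const match = /<h1\b[^>]*>([\s\S]*?)<\/h1>\s*/i.exec(html);
  if (match === null) {
    return { title: null, body: html }; // no heading: leave the body untouched
  }
  // Strip inline markup inside the heading to get a plain-text title.
  const title = (match[1] ?? "").replace(/<[^>]+>/g, "").trim();
  const body = html.slice(0, match.index) + html.slice(match.index + match[0].length);
  return { title, body };
}

const page = splitTitle('<h1 class="page">Service</h1>\n<p>Text</p>');
console.log(page.title); // → Service
console.log(page.body); // → <p>Text</p>
```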

It occurred to me that to get valid links to other pages, you have to load those pages while parsing the current document and get their titles. How do you solve this?

PacoVK commented 6 months ago

Well, I decided to use the following pattern for duplicate titles: -- using this approach allows Captain to construct valid links to other pages... Captain loads the whole page tree, hence at page construction time I already know all links that may need to be modified.
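The "load the whole page tree first" idea can be sketched as a two-pass process: the first pass records the final title of every page, the second pass resolves each link against that map. Everything here (PageInfo, resolveLinkTitles, the sample paths and titles) is hypothetical and only illustrates the approach, not Captain's actual code.

```typescript
// Hypothetical page model: path of the generated file, final (already
// de-duplicated) Confluence title, and the hrefs found in the page.
interface PageInfo {
  path: string;
  title: string;
  links: string[];
}

function resolveLinkTitles(pages: PageInfo[]): Map<string, string[]> {
  // Pass 1: every page's final title is known before any link is rewritten.
  const titleByPath = new Map<string, string>();
  for (const page of pages) {
    titleByPath.set(page.path, page.title);
  }

  // Pass 2: map each href (ignoring its #anchor) to the target page title.
  const resolved = new Map<string, string[]>();
  for (const page of pages) {
    const titles = page.links.map((href) => {
      const path = href.split("#")[0] ?? "";
      return titleByPath.get(path) ?? "(unresolved)";
    });
    resolved.set(page.path, titles);
  }
  return resolved;
}

const linkTitles = resolveLinkTitles([
  { path: "Service/index.html", title: "Service", links: ["Other/index.html#key"] },
  { path: "Other/index.html", title: "Other", links: ["missing.html"] },
]);
console.log(linkTitles);
```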

PacoVK / antora-confluence

Unable to test upload of Antora component #11