meridius / confluence-to-markdown

Confluence to Markdown converter which is actually working
MIT License

Convert Confluence HTML export to Markdown

Requirements

You must have the pandoc command-line tool installed. Check by running:

pandoc --version
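
For scripted setups, the same check can be made non-interactive. The following is a minimal sketch, assuming a POSIX shell; check_tool is a hypothetical helper name and is not part of the converter.

```shell
# check_tool is a hypothetical helper, not part of the converter.
# It reports whether a command is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found"
  else
    echo "$1 not found; install it before running the converter" >&2
    return 1
  fi
}

# Check the two tools this README relies on.
check_tool pandoc || true
check_tool npm || true
```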

Install all project dependencies:

npm install

Usage

In the converter's directory:

npm run start <pathResource> <pathResult>

Parameters

Parameter        Description
<pathResource>   File or directory to convert, containing the extracted Confluence export
<pathResult>     Directory where the output will be generated. Defaults to the current working directory
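
As an illustration of how the two parameters fit together, here is a small wrapper sketch. The function name convert_export and the example paths are hypothetical; only the npm run start invocation comes from this README. Creating the output directory beforehand is an assumption and should be harmless if the converter creates it as well.

```shell
# convert_export is a hypothetical wrapper, not part of the converter.
# $1: extracted Confluence export (file or directory)
# $2: output directory; falls back to the current working directory,
#     mirroring the documented default
convert_export() {
  src="$1"
  out="${2:-$PWD}"
  if [ ! -e "$src" ]; then
    echo "source not found: $src" >&2
    return 1
  fi
  mkdir -p "$out" || return 1
  npm run start "$src" "$out"
}

# Example call with placeholder paths (run from the converter's directory):
#   convert_export ./confluence-export ./markdown-output
```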

Process description

Room for improvement

If you happen to find something not to your liking, you are welcome to send a PR. Some good starting points are mentioned in the Process description section above.

Export to HTML

Note that if the converter does not know how to handle a style, the HTML-to-Markdown conversion typically leaves the HTML untouched (Markdown allows inline HTML tags).
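
To see which converted files still carry raw HTML, a quick search over the output directory can help. This is only a sketch: find_leftover_html is a hypothetical helper, and both the tag list and the .md file extension are assumptions about the output.

```shell
# find_leftover_html is a hypothetical helper, not part of the converter.
# It lists .md files under directory $1 that contain a few common raw
# HTML tags. The tag list and the .md extension are assumptions.
find_leftover_html() {
  find "$1" -type f -name '*.md' \
    -exec grep -lE '</?(div|span|table|tr|td|th)[ >/]' {} +
}

# Example call with a placeholder path:
#   find_leftover_html ./markdown-output
```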

Step by step guide for Confluence data export

  1. Go to the space and choose Space tools > Content Tools on the sidebar.
  2. Choose Export. This option will only be visible if you have the Export Space permission.
  3. Select HTML then choose Next.
  4. Decide whether you need to customize the export:
    • Select Normal Export to produce an HTML file containing all the pages that you have permission to view.
    • Select Custom Export if you want to export a subset of pages, or to exclude comments from the export.
  5. Extract the downloaded zip archive.
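
The extraction in step 5 can be done from the command line. A minimal sketch, assuming the Info-ZIP unzip tool is installed; extract_export is a hypothetical helper, and the archive and directory names are placeholders.

```shell
# extract_export is a hypothetical helper, not part of the converter.
# $1: downloaded export archive, $2: directory to extract into
extract_export() {
  archive="$1"
  target="$2"
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  mkdir -p "$target" && unzip -q "$archive" -d "$target"
}

# Example with placeholder names:
#   extract_export ./Confluence-export.zip ./confluence-export
```

The extracted directory is what you pass to the converter as <pathResource>.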

WARNING
Please note that blog posts will NOT be exported to HTML. You have to copy them manually or export them to XML or PDF, but those formats cannot be processed by this utility.

Attribution

Thanks to Eric White for a starting point.