souvikinator / notion-to-md

Convert notion pages, block and list of blocks to markdown (supports nesting and custom parsing)
https://www.npmjs.com/package/notion-to-md
MIT License
1.11k stars, 91 forks

Synced blocks wrongly produce indented Markdown output #43

Closed: smor closed this issue 2 years ago

smor commented 2 years ago

Hello,

Thanks for the good work; this package is very useful to me!

General context

I'm trying to build a Notion to Hugo process using notion-to-md, which I simply called notion-to-hugo. I want to use it to build online courses in the most generic way I can, allowing content creators to leverage the full potential of Notion while ensuring that the final website works in our course setup.

I want to customize the rendering process so that I can adapt the generated content to the particular Hugo settings I use, including shortcodes, for instance.

I use Notion synced blocks to reuse content across pages.

I started working on a generic pre/post-processing pipeline here. I was very happy to see setCustomTransformer emerge, as I like the direction it's heading!

The issue

This Notion page contains a synced block. When it is processed with notion-to-md, the Markdown output for the MdBlock is the following:

    This block is shared with other pages and editing one changes all other instances.

    > 💡 This is a callout which turns into a `<div class="notices note"></div>` block.

As you can see, the content is indented, which causes Hugo to render the paragraph as a code block. You can see this here on the HTML page generated by notion-to-hugo through notion-to-md.

Analysis

The block structure produced by notion-to-md is the following:

  {
    "type": "synced_block",
    "parent": "",
    "children": [
      {
        "type": "paragraph",
        "parent": "This block is shared with other pages and editing one changes all other instances.",
        "children": []
      },
      {
        "type": "callout",
        "parent": "> 💡 This is a callout which turns into a `<div class=\"notices note\"></div>` block.",
        "children": []
      }
    ]
  }

We can see that the actual content blocks are children of the synced block. The synced block is only a container for actual content, and I don't see any reason why we should treat its children as real children.
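To illustrate the "container only" idea, here is a minimal sketch of a pre-processing step that replaces each synced block with its children before rendering. The `MdBlock` interface simply mirrors the JSON above, and `flattenSyncedBlocks` is a hypothetical helper name; neither is taken from the notion-to-md source, and this is not necessarily how the library handles it.

```typescript
// Hypothetical shape mirroring the JSON structure shown above.
interface MdBlock {
  type: string;
  parent: string;
  children: MdBlock[];
}

// Hypothetical helper: replace every synced_block container with its children,
// recursively, so the content sits at the level where the container was.
function flattenSyncedBlocks(blocks: MdBlock[]): MdBlock[] {
  const result: MdBlock[] = [];
  for (const block of blocks) {
    const children = flattenSyncedBlocks(block.children);
    if (block.type === "synced_block") {
      // The container itself carries no content, so only its children are kept.
      result.push(...children);
    } else {
      result.push({ ...block, children });
    }
  }
  return result;
}
```

Applied to the structure above, this would return the paragraph and the callout as two top-level blocks, so a renderer that indents by nesting level would leave them unindented.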

The indentation is applied here:

      if (mdBlocks.parent) {
        if (
          mdBlocks.type !== "to_do" &&
          mdBlocks.type !== "bulleted_list_item" &&
          mdBlocks.type !== "numbered_list_item"
        ) {
          // add extra line breaks around non-list blocks
          mdString += `\n${md.addTabSpace(mdBlocks.parent, nestingLevel)}\n\n`;
        } else {
          mdString += `${md.addTabSpace(mdBlocks.parent, nestingLevel)}\n`;
        }
      }

My guess is that we should not apply addTabSpace to children of blocks whose type is synced_block.
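A rough sketch of that guess follows, assuming the renderer walks the tree recursively and passes a nesting level down to the children. The `addTabSpace` function below is a stand-in that indents by four spaces per level (an assumption, not the library's `md.addTabSpace`), and `toMarkdownString` here is a simplified illustration rather than the actual notion-to-md implementation or the change made in the PR.

```typescript
interface MdBlock {
  type: string;
  parent: string;
  children: MdBlock[];
}

// Stand-in for md.addTabSpace: assumed to indent by four spaces per level.
function addTabSpace(text: string, level: number): string {
  let indent = "";
  for (let i = 0; i < level; i++) indent += "    ";
  return indent + text;
}

const LIST_TYPES = ["to_do", "bulleted_list_item", "numbered_list_item"];

// Simplified renderer: same line-break rules as the snippet above,
// but children of a synced_block keep their container's nesting level.
function toMarkdownString(blocks: MdBlock[], nestingLevel = 0): string {
  let mdString = "";
  for (const block of blocks) {
    if (block.parent) {
      const line = addTabSpace(block.parent, nestingLevel);
      // Non-list blocks get extra line breaks, list items do not.
      mdString += LIST_TYPES.indexOf(block.type) === -1 ? `\n${line}\n\n` : `${line}\n`;
    }
    if (block.children.length > 0) {
      // A synced_block is only a container, so do not increase the level for it.
      const childLevel = block.type === "synced_block" ? nestingLevel : nestingLevel + 1;
      mdString += toMarkdownString(block.children, childLevel);
    }
  }
  return mdString;
}
```

With this rule, the paragraph and callout inside the synced block would be emitted at the container's own level, while children of other block types would still be indented one level deeper.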

Attempts at solving the issue

I tried the following things:

I'm feeling stuck, probably because of my lack of understanding of some of the processing flows. Would you be able to help me sort it out? I hope I provided enough information for that.

Thanks a lot!

souvikinator commented 2 years ago

Hi there. Thanks for reporting the issue and for PR #44. For the time being, your PR will fix the issue, so I'll merge it. However, I believe this may be an issue with other blocks as well, so I'll release a fix in the coming days.

smor commented 2 years ago

Hi,

Thanks for the quick answer and merge! I think that this would need to be investigated further as well.

Best regards

souvikinator commented 2 years ago

The fix is now live in v2.5.3. Thanks for the contribution :)

souvikinator commented 2 years ago

> Hi,
>
> Thanks for the quick answer and merge! I think that this would need to be investigated further as well.
>
> Best regards

Yes, I'm on it.