souvikinator / notion-to-md

Convert Notion pages, blocks, and lists of blocks to Markdown (supports nesting and custom parsing)
https://www.npmjs.com/package/notion-to-md
MIT License
1.08k stars 89 forks

New document for each page? #71

Closed kerim closed 1 year ago

kerim commented 1 year ago

I finally got this working (thanks for your help!) but now I've discovered that it tries to put everything into a single markdown file. Is there a way to get it to create a new markdown file for each child page?

souvikinator commented 1 year ago

At the moment there is no such option; however, this could be a feature in an upcoming release. By default, it merges everything into one file. To get one file per child page, notion-to-md would need an option you can pass that saves child page data to a new file.

If you have something in mind, feel free to share.

Radiergummi commented 1 year ago

I just hit on the same thing - trying to write all pages below a root page into a directory that directly mirrors the structure on Notion. Is there a simple way to discern blocks by page?

souvikinator commented 1 year ago

At the moment, child pages are handled in a way that makes it difficult to save them to separate files. This is definitely an interesting use case, and I'm working on changing how child pages are handled, though it may take some time.

Any contribution is appreciated.

Radiergummi commented 1 year ago

I solved this just now by doing the following:

async function renderPage(id: string) {
  const page = await notion.pages.retrieve({ page_id: id });
  const blocks = await n2m.pageToMarkdown(page.id);

  // Simple utility function to partition by predicate, 
  // sorting blocks into children pages and everything else.
  // See here for my implementation:
  // https://gist.github.com/Radiergummi/ccc83114df365bb5bfd0db619fe8e056
  const [ childBlocks, markdownBlocks ] = partition(
    blocks,
    (block): block is Block => block.type === 'child_page',
  );

  // ... Process pages here ...

  const content = n2m.toMarkdownString( markdownBlocks );
  const children = await Promise.all(childBlocks.map(
    async child => renderPage(child.blockId),
  ));

  return { content, children };
}

This works by recursively processing pages, partitioning the blocks of a page into child pages and content blocks, then converting only the content blocks to markdown and recursively running the child page blocks through the render function again.
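For anyone who doesn't want to open the gist: below is a minimal sketch of what a `partition` helper like the one used above could look like. It is not the gist's implementation, just an equivalent written for illustration, and `MdBlockLike` is a hypothetical stand-in for the block objects returned by `pageToMarkdown` (only the fields used in this thread).

```typescript
// Hypothetical stand-in for the markdown block objects used in this thread.
interface MdBlockLike {
  type?: string;
  blockId: string;
  parent: string;
}

// Splits an array into [matches, rest] in a single pass, preserving order.
function partition<T>(
  items: readonly T[],
  predicate: (item: T) => boolean,
): [T[], T[]] {
  const matches: T[] = [];
  const rest: T[] = [];

  for (const item of items) {
    (predicate(item) ? matches : rest).push(item);
  }

  return [matches, rest];
}

// Made-up sample data: one content block and one child page block.
const sample: MdBlockLike[] = [
  { type: "paragraph", blockId: "a", parent: "Hello" },
  { type: "child_page", blockId: "b", parent: "Sub page" },
];

const [childBlocks, markdownBlocks] = partition(
  sample,
  (block) => block.type === "child_page",
);
```

Here `childBlocks` ends up holding the child page entries and `markdownBlocks` everything else, which is the split `renderPage` relies on.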

I'm sure that could be somehow abstracted away at the library level, although I don't have a neat suggestion at hand right now. I hope this helps a bit, though.
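To get from that `{ content, children }` tree to one markdown file per page, something along these lines might work. It is only a sketch under assumptions: the `title` field is hypothetical (the `renderPage` above does not return one, so you would have to add it yourself), and `slugify` and `planFiles` are names made up for this example. The function only plans the file paths, it does not touch the disk.

```typescript
// Hypothetical shape: renderPage's result plus a title you would add yourself.
interface RenderedPage {
  title: string;
  content: string;
  children: RenderedPage[];
}

interface PlannedFile {
  path: string;
  content: string;
}

// Turns a page title into a file-system friendly name.
function slugify(title: string): string {
  const slug = title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return slug || "untitled";
}

// Walks the page tree and returns one planned file per page.
// Each page becomes "<dir>/<slug>.md"; its children go into "<dir>/<slug>/".
function planFiles(page: RenderedPage, dir = "."): PlannedFile[] {
  const base = `${dir}/${slugify(page.title)}`;
  const files: PlannedFile[] = [{ path: `${base}.md`, content: page.content }];

  for (const child of page.children) {
    files.push(...planFiles(child, base));
  }

  return files;
}

// Made-up example tree.
const planned = planFiles({
  title: "Docs",
  content: "# Docs",
  children: [{ title: "Getting Started", content: "# Getting Started", children: [] }],
});
```

Writing the result out is then a loop over `planned` that creates each parent directory with `fs.promises.mkdir(..., { recursive: true })` and calls `fs.promises.writeFile`. Note that two sibling pages with the same title would collide in this sketch, so a real version would need to disambiguate them, for example by appending the block ID.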

souvikinator commented 1 year ago

Does this work when the child pages are in a column block?

Radiergummi commented 1 year ago

> Does this work when the child pages are in a column block?

~Haven't tried that yet, let me check~

@souvikinator no, it doesn't. Dang. Going to take a look at the responses and see what I can do to fix this.

souvikinator commented 1 year ago

So that is where the issue arises. Before the v2.5.6 release, the column block handler was buggy. The fix in the latest version was more of a workaround: it handles most cases, but not this one. Synced blocks, and other blocks that can be treated as collections of blocks, raise similar cases, so we need to handle them separately.
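One possible way to cover the column case on the user side, until the library handles it, is to look for child pages at every nesting level instead of only the top one. The sketch below assumes that the contents of container blocks (columns, synced blocks and so on) show up in a `children` array on each block; whether that holds for your version is something to verify. `MdBlockLike` and `extractChildPages` are hypothetical names for this example, not part of notion-to-md.

```typescript
// Hypothetical stand-in for a markdown block with nested children.
interface MdBlockLike {
  type?: string;
  blockId: string;
  parent: string;
  children: MdBlockLike[];
}

interface Extracted {
  childPages: MdBlockLike[]; // child_page blocks found at any depth
  content: MdBlockLike[];    // the original tree with child pages removed
}

// Recursively pulls child_page blocks out of the tree, so pages nested
// inside container blocks are found as well as top-level ones.
function extractChildPages(blocks: MdBlockLike[]): Extracted {
  const childPages: MdBlockLike[] = [];
  const content: MdBlockLike[] = [];

  for (const block of blocks) {
    if (block.type === "child_page") {
      childPages.push(block);
      continue;
    }

    // Descend into containers and keep only their non-page content.
    const nested = extractChildPages(block.children ?? []);
    childPages.push(...nested.childPages);
    content.push({ ...block, children: nested.content });
  }

  return { childPages, content };
}

// Made-up example: a child page sitting inside a column.
const tree: MdBlockLike[] = [
  {
    type: "column_list", blockId: "cl", parent: "", children: [
      {
        type: "column", blockId: "c1", parent: "", children: [
          { type: "paragraph", blockId: "p1", parent: "Text", children: [] },
          { type: "child_page", blockId: "cp1", parent: "Nested page", children: [] },
        ],
      },
    ],
  },
];

const extracted = extractChildPages(tree);
```

In the `renderPage` function from earlier in the thread, this would take the place of the `partition` call: `extracted.content` goes to `toMarkdownString` and `extracted.childPages` is rendered recursively.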

souvikinator commented 1 year ago

This feature should be live in the upcoming release. Feel free to reopen the issue. Thank you for your contribution :))