souvikinator / notion-to-md

Convert notion pages, block and list of blocks to markdown (supports nesting and custom parsing)
https://www.npmjs.com/package/notion-to-md
MIT License

Performance Issue: Taking excessive time to convert Blocks to Markdown #107

Open tictaqqn opened 3 months ago

tictaqqn commented 3 months ago

Description

I've encountered a performance issue with the notion-to-md package where the conversion of Notion pages to Markdown takes an excessively long time.

Steps to Reproduce

Run the script below. saveBlocksToFile finishes within 5 seconds, but saveBlocksToMarkdown takes more than 10 minutes.

import 'dotenv/config';
import { promises as fs } from 'fs';

import { Client } from '@notionhq/client';
import { NotionToMarkdown } from 'notion-to-md';

// Read the Notion API token and page ID from environment variables
const notionToken = process.env.NOTION_TOKEN;
const pageId = process.env.PAGE_ID ?? '';  // Page ID in UUID format

const notion = new Client({
    auth: notionToken
});

const n2m = new NotionToMarkdown({ notionClient: notion });

async function fetchAllBlocks(blockId: string, startCursor?: string) {
    let blocks: unknown[] = [];
    let hasMore = true;
    let cursor = startCursor;

    while (hasMore) {
        const response = await notion.blocks.children.list({
            block_id: blockId,
            start_cursor: cursor,
            page_size: 100
        });
        blocks = blocks.concat(response.results);
        hasMore = response.has_more;
        cursor = response.next_cursor ?? undefined;
    }

    return blocks;
}

async function saveBlocksToMarkdown(blocks: unknown[], filename: string) {
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    const markdown = await n2m.blocksToMarkdown(blocks as any[]);
    await fs.writeFile(filename, markdown, 'utf-8');
    console.log(`Markdown saved to ${filename}`);
}

async function saveBlocksToFile(blocks: unknown[], filename: string) {
    const jsonContent = JSON.stringify(blocks, null, 2);
    await fs.writeFile(filename, jsonContent, 'utf-8');
    console.log(`Blocks saved to ${filename}`);
}

async function run() {
    try {
        const blocks = await fetchAllBlocks(pageId);
        await saveBlocksToFile(blocks, './outputs/notion_blocks.json');
        await saveBlocksToMarkdown(blocks, './outputs/notion_blocks.md');
    } catch (error) {
        console.error('Error retrieving or saving Notion blocks:', error);
    }
}

run();
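
To make the difference between the two steps measurable, each call in run() could be wrapped in a small timing helper. This is only a sketch: `timed` is a hypothetical name, not part of notion-to-md or the Notion client, and it uses `Date.now()`, so the resolution is about a millisecond.

```typescript
// Hypothetical helper (not part of notion-to-md): runs one async step,
// logs its wall-clock duration, and returns both the result and the duration.
async function timed<T>(
    label: string,
    step: () => Promise<T>
): Promise<{ result: T; ms: number }> {
    const start = Date.now();
    const result = await step();
    const ms = Date.now() - start;
    console.log(`${label}: ${ms} ms`);
    return { result, ms };
}
```

Inside run(), the calls would then look like `await timed('json', () => saveBlocksToFile(blocks, './outputs/notion_blocks.json'))` and `await timed('markdown', () => saveBlocksToMarkdown(blocks, './outputs/notion_blocks.md'))`.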

Expected Behavior

The conversion process should complete in a reasonable amount of time, proportional to the complexity and size of the Notion page being converted.

Actual Behavior

With about 10,000 blocks, the conversion takes more than 10 minutes, far longer than the size of the page would justify.

Possible Solution

I am not sure what might be causing this issue, but it might be related to how data is fetched or processed during the conversion. A review of the fetching and parsing mechanisms might be needed.
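
One way to check the fetching hypothesis is to count how many of the fetched blocks report `has_children: true`. The sketch below assumes, without confirmation from the library source, that the converter issues at least one additional API request per such block; if that holds, a large count would explain a long run time. `summarizeBlocks`, `BlockLike`, and `BlockSummary` are hypothetical names introduced for illustration; `has_children` and `type` are fields of Notion block objects.

```typescript
// Minimal shape of the fields inspected here; real Notion blocks carry more.
interface BlockLike {
    has_children?: boolean;
    type?: string;
}

interface BlockSummary {
    total: number;
    withChildren: number;
    byType: Record<string, number>;
}

// Hypothetical diagnostic: counts blocks overall, blocks that have children,
// and blocks per type. It makes no API calls.
function summarizeBlocks(blocks: unknown[]): BlockSummary {
    const summary: BlockSummary = { total: blocks.length, withChildren: 0, byType: {} };
    for (const raw of blocks) {
        // Tolerate null or malformed entries by treating them as empty objects.
        const block = (raw ?? {}) as BlockLike;
        if (block.has_children === true) {
            summary.withChildren += 1;
        }
        const type = block.type ?? 'unknown';
        summary.byType[type] = (summary.byType[type] ?? 0) + 1;
    }
    return summary;
}
```

Calling `console.log(summarizeBlocks(blocks))` right after fetchAllBlocks in run() would show whether nested blocks make up a large share of the page.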

Additional Context