NotionX / react-notion-x

Fast and accurate React renderer for Notion. TS batteries included. ⚡️
https://react-notion-x-demo.transitivebullsh.it
MIT License

Bug: getCanonicalPageId does not support non-latin page titles #422

Closed marharyta closed 1 year ago

marharyta commented 1 year ago

Description

Unfortunately, I could not open a PR: the repo seems to require permission for me to push a new branch. Nevertheless, here is the problem description and my proposed solution:

Screenshot 2023-01-21 at 18 40 14

Problem: getCanonicalPageId does not support non-latin page titles

Issue:

I am using [Notion.so](http://notion.so/) to run the [FinUA.org](http://finua.org/) website, which is currently deployed with [super.so](http://super.so/). I have been using the nextjs-notion-starter-kit project for it (thank you).

After deploying the project to Vercel, I noticed quite a few browser warnings about the page, caused by the generated page URLs (they looked broken).

Screenshot 2023-01-21 at 18 41 25

The page behind it:

Screenshot 2023-01-21 at 18 41 37

Moreover, this page also had the same generated URL (/-) despite being a separate page, and clicking on it led to the first page.

Screenshot 2023-01-21 at 18 42 13

I investigated, and the problem seemed to be in this module: https://github.com/transitive-bullshit/nextjs-notion-starter-kit/blob/main/lib/get-canonical-page-id.ts

import { ExtendedRecordMap } from 'notion-types'
import {
  getCanonicalPageId as getCanonicalPageIdImpl,
  parsePageId
} from 'notion-utils'

import { inversePageUrlOverrides } from './config'

export function getCanonicalPageId(
  pageId: string,
  recordMap: ExtendedRecordMap,
  { uuid = true }: { uuid?: boolean } = {}
): string | null {
  const cleanPageId = parsePageId(pageId, { uuid: false })
  if (!cleanPageId) {
    return null
  }

  const override = inversePageUrlOverrides[cleanPageId]
  if (override) {
    return override
  } else {
    // PROBLEM: this line seemed to be the issue
    return getCanonicalPageIdImpl(pageId, recordMap, {
      uuid
    })
  }
}

I went to the notion-utils package, https://github.com/NotionX/react-notion-x/tree/master/packages/notion-utils

and copied the https://github.com/NotionX/react-notion-x/blob/master/packages/notion-utils/src/get-canonical-page-id.ts module. The problem seemed to be in the getCanonicalPageId function: the call normalizeTitle(getBlockTitle(block, recordMap)) only seemed to work for Latin symbols.

I pulled out the normalizeTitle function and tested it on its own, and yes, that seems to be the case:

function normalizeTitle(title) {
  return (title || '')
    .replace(/ /g, '-')
    .replace(/[^a-zA-Z0-9-\u4e00-\u9fa5]/g, '')
    .replace(/--/g, '-')
    .replace(/-$/, '')
    .replace(/^-/, '')
    .trim()
    .toLowerCase()
}

const eng = normalizeTitle('Naapurin Maalaiskana (NMK), in Lieto, in Turku area');
const ukr = normalizeTitle('Робота помічника з обслуговування контейнерів');
const ukr1 = normalizeTitle('Ищем литейщиков в Карккила, Финляндия, для обработки изделий в металлургической промышленности');
console.log('test', eng, ukr, ukr1)

// "test"
// "naapurin-maalaiskana-nmk-in-lieto-in-turku-area"
// ""
// "---"
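
To see where the Cyrillic titles get lost, the same replace chain can be traced one step at a time. This is only an illustrative sketch: traceNormalizeTitle is a hypothetical helper written for this explanation, not part of notion-utils. The character class keeps only ASCII letters, digits, hyphens, and the \u4e00-\u9fa5 range, so every Cyrillic letter is dropped in the second step and only the hyphens that replaced the spaces remain.

```typescript
// Hypothetical helper: runs the same replace chain as normalizeTitle above,
// but records the intermediate string after every step.
function traceNormalizeTitle(title: string): string[] {
  const steps: Array<(s: string) => string> = [
    (s) => s.replace(/ /g, '-'), // 1. spaces become hyphens
    (s) => s.replace(/[^a-zA-Z0-9-\u4e00-\u9fa5]/g, ''), // 2. drop everything outside the class
    (s) => s.replace(/--/g, '-'), // 3. a single pass that halves runs of hyphens
    (s) => s.replace(/-$/, ''), // 4. strip one trailing hyphen
    (s) => s.replace(/^-/, ''), // 5. strip one leading hyphen
    (s) => s.trim().toLowerCase() // 6. trim and lowercase
  ]

  const trace: string[] = []
  let current = title
  for (const step of steps) {
    current = step(current)
    trace.push(current)
  }
  return trace
}

const trace = traceNormalizeTitle('Робота помічника з обслуговування контейнерів')
console.log(trace)
// [ 'Робота-помічника-з-обслуговування-контейнерів', '----', '--', '-', '', '' ]
```

The longer title from the test above has ten spaces, so it is reduced to ten hyphens after step 2. The single /--/g pass leaves five, and stripping one hyphen from each end leaves the "---" seen in the output.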

Solution:

What worked for me was simply replacing normalizeTitle(getBlockTitle(block, recordMap)) with slugify from the transliteration npm package.
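
For comparison, a dependency-free direction would be to widen the character class with Unicode property escapes rather than transliterate. The sketch below is an assumption-laden illustration, not the slugify-based change described above and not the code from notion-utils: normalizeTitleUnicode and NOT_LETTER_DIGIT_OR_DASH are hypothetical names, and the function also collapses whole runs of hyphens, which the original does not.

```typescript
// Hypothetical Unicode-aware variant of normalizeTitle (illustration only).
// The pattern is built with `new RegExp` so that the `u` flag and the \p{...}
// escapes are interpreted by the JavaScript engine at runtime (ES2018+).
const NOT_LETTER_DIGIT_OR_DASH = new RegExp('[^\\p{L}\\p{N}-]', 'gu')

function normalizeTitleUnicode(title?: string | null): string {
  return (title || '')
    .trim()
    .replace(/\s+/g, '-') // whitespace runs become a single hyphen
    .replace(NOT_LETTER_DIGIT_OR_DASH, '') // keep letters and digits from any script
    .replace(/-+/g, '-') // collapse any remaining runs of hyphens
    .replace(/^-|-$/g, '') // strip a leading and a trailing hyphen
    .toLowerCase()
}

console.log(normalizeTitleUnicode('Naapurin Maalaiskana (NMK), in Lieto, in Turku area'))
// naapurin-maalaiskana-nmk-in-lieto-in-turku-area
console.log(normalizeTitleUnicode('Робота помічника з обслуговування контейнерів'))
// робота-помічника-з-обслуговування-контейнерів
```

The trade-off is that this variant keeps the original script in the slug, so the path is percent-encoded on the wire, whereas a transliterating slugify produces an ASCII-only slug.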

Notion Test Page ID

701245d6db8c413689d180e87269ee56

marharyta commented 1 year ago

Created a PR: https://github.com/NotionX/react-notion-x/pull/423. Closing the issue.