kjk / notionapi

Unofficial Go API for Notion.so
https://blog.kowalczyk.info/article/c9df78cbeaae4e0cb2848c9964bcfc94/using-notion-api-go-client.html
BSD 2-Clause "Simplified" License
1.82k stars, 86 forks

Exporting markdown fails #55

Open tkrajina opened 1 year ago

tkrajina commented 1 year ago

Hi @kjk

We're using your excellent library for one reason -- it has the option of exporting pages with markdown. The official API has no markdown exporting.

Now, this used to work with your notionapi, but a few weeks ago they changed something on their backend and it now fails.

We're using client.ExportPages(), which internally uses enqueueTask and getTasks and then returns the download URL.

Example code to reproduce the problem:

    client := &notionapi.Client{}

    client.AuthToken = "..."
    client.DebugLog = true

    url, err := client.RequestPageExportURL("...", notionapi.ExportTypeMarkdown, false)
    panicIfErr(err)
    fmt.Println(url)

    res, err := client.DownloadURL(url)
    panicIfErr(err)

    fmt.Println("res:", res.Data)

...and it errors in client.DownloadURL().

We tried to inspect the browser to see why downloading the file there works, and this is the final "download file" request (stripped of all the other unneeded headers):

    curl 'https://file.notion.so/....zip?id=...&table=user_export&expirationTimestamp=...&signature=...&download=true&downloadName=....zip' \
      -H 'cookie: file_token=...;' \
      --compressed

So it looks like they now require a file_token cookie for requests to file.notion.so. Having the download URL alone is no longer enough.
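
For reference, the curl request above can be sketched in Go with the standard net/http package. This is only a minimal sketch under assumptions: the helper names downloadExport and newFakeFileServer are hypothetical and not part of notionapi, the token value is a placeholder (how file_token is obtained is still unknown), and a local test server stands in for file.notion.so, so the 403 it returns is illustrative rather than Notion's actual behavior.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// downloadExport GETs exportURL with a file_token cookie attached, mirroring
// the browser request. fileToken must be supplied by the caller; this sketch
// does not know how Notion issues it.
func downloadExport(client *http.Client, exportURL, fileToken string) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, exportURL, nil)
	if err != nil {
		return nil, err
	}
	req.AddCookie(&http.Cookie{Name: "file_token", Value: fileToken})

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("download failed: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// newFakeFileServer is a local stand-in for file.notion.so (an assumption for
// demonstration only). It serves body when the file_token cookie matches
// wantToken and responds with 403 otherwise.
func newFakeFileServer(wantToken string, body []byte) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		c, err := r.Cookie("file_token")
		if err != nil || c.Value != wantToken {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		w.Write(body)
	}))
}

func main() {
	// Placeholder token and payload; nothing here talks to Notion.
	srv := newFakeFileServer("placeholder-token", []byte("fake zip bytes"))
	defer srv.Close()

	data, err := downloadExport(srv.Client(), srv.URL, "placeholder-token")
	if err != nil {
		fmt.Println("download failed:", err)
		return
	}
	fmt.Printf("downloaded %d bytes\n", len(data))
}
```

Against the real service, exportURL would be the URL returned by RequestPageExportURL, and the missing piece remains a valid file_token value.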

Any idea how the file_token is retrieved/calculated?

kjk commented 1 year ago

Sorry, no idea. I currently don't have time to investigate this so you're on your own.

I'm guessing file_token is either returned by an existing API (and not reflected in the Go structs because it wasn't there when I initially wrote the code) or there's another API they added to get it.

Be happy to merge a PR if you figure it out.

As far as figuring it out: you can follow the same process as I did originally https://blog.kowalczyk.info/article/88aee8f43620471aa9dbcad28368174c/how-i-reverse-engineered-notion-api.html

Basically: invoke the action from the browser and see what API calls the browser makes.

Write a Puppeteer script to record API calls for easier analysis.
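
One way to do this kind of analysis without Puppeteer is to save the browser's network log as a HAR file (browser DevTools can export one) and scan it offline. The Go sketch below lists every recorded response whose Set-Cookie header sets a given cookie, which is one place file_token could come from. The type and function names are hypothetical, the sample HAR data and example.com URLs are made up for illustration, and whether Notion sets file_token through a Set-Cookie header at all is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Minimal subset of the HAR 1.2 structure needed for this scan.
type harHeader struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

type harEntry struct {
	Request struct {
		Method string `json:"method"`
		URL    string `json:"url"`
	} `json:"request"`
	Response struct {
		Headers []harHeader `json:"headers"`
	} `json:"response"`
}

type harFile struct {
	Log struct {
		Entries []harEntry `json:"entries"`
	} `json:"log"`
}

// setsCookie reports whether a Set-Cookie header value sets cookieName.
// Some exporters join several Set-Cookie values with newlines, so each
// line is checked separately.
func setsCookie(headerValue, cookieName string) bool {
	for _, line := range strings.Split(headerValue, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), cookieName+"=") {
			return true
		}
	}
	return false
}

// findCookieSetters returns "METHOD URL" for every recorded request whose
// response sets cookieName via a Set-Cookie header.
func findCookieSetters(harJSON []byte, cookieName string) ([]string, error) {
	var har harFile
	if err := json.Unmarshal(harJSON, &har); err != nil {
		return nil, err
	}
	var hits []string
	for _, e := range har.Log.Entries {
		for _, h := range e.Response.Headers {
			if strings.EqualFold(h.Name, "Set-Cookie") && setsCookie(h.Value, cookieName) {
				hits = append(hits, e.Request.Method+" "+e.Request.URL)
				break
			}
		}
	}
	return hits, nil
}

func main() {
	// Made-up sample data; a real run would read an exported .har file instead.
	sample := []byte(`{"log":{"entries":[
	  {"request":{"method":"POST","url":"https://example.com/hypothetical-endpoint"},
	   "response":{"headers":[{"name":"set-cookie","value":"file_token=abc; Path=/"}]}},
	  {"request":{"method":"GET","url":"https://example.com/other"},
	   "response":{"headers":[{"name":"Content-Type","value":"text/html"}]}}
	]}}`)

	hits, err := findCookieSetters(sample, "file_token")
	if err != nil {
		fmt.Println("could not parse HAR:", err)
		return
	}
	for _, h := range hits {
		fmt.Println(h)
	}
}
```

If the scan finds nothing in a real capture, that would suggest the cookie is set some other way, for example by client-side JavaScript.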

tkrajina commented 1 year ago

Unfortunately, that's exactly what I did (I checked the browser logs), and file_token isn't in the API responses. I think it's somehow calculated in the (obfuscated) JavaScript. I'll keep investigating. Thank you for your work anyway.

nisanthchunduru commented 6 months ago

@tkrajina Did you resolve this problem?

nisanthchunduru commented 6 months ago

Found this alternative to Notion's undocumented Export to Markdown API:

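    // Requires the packages github.com/kjk/notionapi and
    // github.com/kjk/notionapi/tomarkdown.
    kjkNotionApiClient := &notionapi.Client{
        AuthToken: tokenV2CookieString,
    }
    // Download the page through the unofficial API, then convert it locally.
    childPage, err := kjkNotionApiClient.DownloadPage(childPageId)
    if err != nil {
        printErrorAndExit(err)
    }
    markdown := tomarkdown.NewConverter(childPage).ToMarkdown()
    fmt.Println(string(markdown))

nisanthchunduru commented 6 months ago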

In case it helps anyone, I authored a Go package to export any Notion page to Markdown: https://github.com/nisanthchunduru/notion2markdown. It accepts a Notion integration token instead of Notion's token_v2 cookie string.