mixmark-io / turndown

🛏 An HTML to Markdown converter written in JavaScript
https://mixmark-io.github.io/turndown
MIT License

Is it possible to use turndown in Cloudflare Workers? #469

Closed Puliczek closed 1 month ago

Puliczek commented 1 month ago

I was wondering whether it is possible to use the Turndown library within Cloudflare Workers.

Cloudflare Workers doesn't support the DOM API, and we cannot access DOMParser. I am getting the error "document is not defined". There is an HTMLRewriter that could potentially help, but I'm not sure how to implement it with Turndown.

My code:

import TurndownService from "turndown";

// Fails in Cloudflare Workers with "document is not defined"
const turndownService = new TurndownService({
  hr: '---',
  codeBlockStyle: 'fenced'
});
const markdown = turndownService.turndown("<h1>Test</h1>");

Has anyone successfully implemented this, and are there any specific steps or modifications needed to make it work within the Workers environment?

Thanks in advance for your help!
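To make the failure described above easier to diagnose, here is a minimal sketch of a runtime check. It assumes the error comes from the `document` and `DOMParser` globals being absent, as reported in this comment. `hasDomApi` is a hypothetical helper name and is not part of Turndown or the Workers runtime.

```typescript
// Hypothetical helper: reports whether a global scope exposes the browser DOM
// entry points (`document` and `DOMParser`) mentioned in the error above.
// The scope is a parameter so the check can be exercised with plain objects.
function hasDomApi(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): boolean {
  return (
    typeof scope.document !== "undefined" &&
    typeof scope.DOMParser !== "undefined"
  );
}

// In a runtime without a DOM, such as the Workers environment described above,
// this is expected to log false.
console.log("DOM API available:", hasDomApi());
```

If the check reports no DOM, the HTML has to be parsed by some other means before Turndown can convert it.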

pavelhoral commented 1 month ago

Don't use the browser-based build of the library for server workers. Use the Node.js dependency, which uses the @mixmark-io/domino parser, or use that parser directly.

Puliczek commented 1 month ago

@pavelhoral thanks for the help.

The problem is that Cloudflare does not support the require function by default. I prefer not to add Webpack to my project. Here is the code I am using and it works:

Install: npm install @mixmark-io/domino

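import { createDocument } from "@mixmark-io/domino";
import TurndownService from "turndown";

// Parse the HTML string into a Document with domino, so no browser DOM is needed
const html = createDocument("<h1>asdasd</h1>");
const turndownService = new TurndownService({
    hr: "---",
    codeBlockStyle: "fenced",
});

// Pass the parsed Document (not the raw string) to Turndown
const markdown = turndownService.turndown(html);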

To remove the type error for "@mixmark-io/domino", I added the following file to my project:

types\domino.d.ts

declare module "@mixmark-io/domino" {
  export function createDocument(html: string): Document;
}

@pavelhoral It works, but is this an acceptable solution? While it may not be the best approach, I am curious if it might lead to any errors or performance issues.
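One way to keep the approach above in a single place is to wrap the parse-then-convert step in a small function. This is only a sketch: `createHtmlToMarkdown`, `HtmlParser`, and `MarkdownConverter` are hypothetical names introduced here, and the two interfaces merely mirror the `createDocument` and `turndown` calls used in this comment so the snippet compiles without either package installed.

```typescript
// Structural stand-ins for the two libraries (assumed shapes, based only on
// the calls shown in this thread).
interface HtmlParser {
  createDocument(html: string): unknown;
}
interface MarkdownConverter {
  turndown(input: unknown): string;
}

// Hypothetical helper: returns a function that parses an HTML string into a
// document and hands that document to the converter.
function createHtmlToMarkdown(
  parser: HtmlParser,
  converter: MarkdownConverter
): (html: string) => string {
  return (html) => converter.turndown(parser.createDocument(html));
}

// Wiring with the real packages would look roughly like this:
//   import { createDocument } from "@mixmark-io/domino";
//   import TurndownService from "turndown";
//   const toMarkdown = createHtmlToMarkdown(
//     { createDocument },
//     new TurndownService({ hr: "---", codeBlockStyle: "fenced" })
//   );
//   const markdown = toMarkdown("<h1>asdasd</h1>");
```

Building the converter once and reusing the returned function avoids constructing a new TurndownService for every conversion. Whether that matters for performance in Workers has not been measured here.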

pavelhoral commented 1 month ago

It feels like an acceptable solution. Maybe there are other solutions that I am not seeing (I am not familiar with Cloudflare Workers).