oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
https://bun.sh

Bun is 20x slower with "marked" npm library compared to Node #3464

Open p0358 opened 1 year ago

p0358 commented 1 year ago

What version of Bun is running?

0.6.11

What platform is your computer?

WSL | Microsoft Windows NT 10.0.19045.0 x64 | Linux 5.15.90.1-microsoft-standard-WSL2 x86_64 unknown

What steps can reproduce the bug?

I posted steps to reproduce the issue over there (includes screenshot from WebKit's profiler too): https://github.com/markedjs/marked/issues/2863

But TL;DR is that with this code:

import * as fs from "fs";
import { marked } from "marked";

const renderer = new marked.Renderer(); // these two lines below seem to make no difference though
renderer.link = (href, title, text) => `<a target="_external" onclick="return window.onLinkClick(this)" href="${ href }" ${title? 'title="'+title+'"' : ''}>${ text }</a>`;
renderer.heading = (text, level) => `${level === 1 ? '</div><div class="changelogEntry">' : ''}<h${level}>${text}</h${level}>`;
const data = fs.readFileSync("./test.md").toString();

marked(data, { renderer }, (error, parseResult) => {
    if (error)
        throw error;
    console.log(parseResult.length);
});

and this markdown file: test.md

it takes 1.5s to run with Bun, and 0.07s to run with Node.

It's worth mentioning that this might be a JavaScriptCore vs. V8 issue: I was also able to reproduce it with Ultralight (an embedded WebKit for applications), and there the slowdown was even more pronounced.

What is the expected behavior?

The code should have relatively similar performance in JSC vs V8, like other Markdown parsers

What do you see instead?

Unexpectedly bad performance: 20x slower, taking seconds instead of milliseconds

Additional information

I know this is generally an issue with Marked and JavaScriptCore rather than Bun itself. But I noticed that Bun's issue tracker follows a lot of issues with popular external libraries that break or behave differently in Bun vs. Node, the goal being that the same code that runs on Node also runs on Bun. So I thought the team here, given their experience with JSC, might want to take a look at profiling this to find out where the problem is...

Jarred-Sumner commented 1 year ago

This is a case where it is 100% on Bun to address -- not the fault of JavaScriptCore or the library.

p0358 commented 1 year ago

Fair, I respect this approach. I just wanted to point out that it also happens in a full WebKit-based browser, in case it matters. (I feel like some changes may need to be made in either JSC or the library; I'm somewhat curious what could cause it to slow down this much.)

Jarred-Sumner commented 1 year ago

That being said...looks like the Regex implementation in JSC is the cause.

Jarred-Sumner commented 1 year ago

https://bugs.webkit.org/show_bug.cgi?id=258706

dylang commented 12 months ago

Hi! I don't suppose it's possible to use https://github.com/google/re2 as a replacement for JSC's regex engine until the performance problems can be addressed?

yschroe commented 11 months ago

I'm not sure if it is the same underlying issue, but I've run into similar problems when combining capturing groups with quantifiers in RegExps. See the following small benchmark script:

test.mjs

import { run, bench } from 'mitata';
const TEST_STRING =
  'Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquid ex ea commodi consequat. Quis aute iure reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint obcaecat cupiditat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';

const REGEXES = [
  /A{1,3}B{1,3}/g,
  /A{1,3}(B){1,3}/g,
  /(A)(B){1,3}/g,
  /(A){1,3}(B){1,3}/g
];

for (const regex of REGEXES) {
  bench(String(regex), () => {
      TEST_STRING.match(regex);
  });
}

await run();

The last regexp is about 25x slower on Bun than on Node.js (6.81 µs vs. 276.53 ns per iteration):

cpu: AMD Ryzen 5 2600X Six-Core Processor

runtime: node v20.6.1 (x64-linux)
benchmark                time (avg)             (min … max)       p75       p99      p995
----------------------------------------------------------- -----------------------------
/A{1,3}B{1,3}/g      184.57 ns/iter (158.55 ns … 262.88 ns) 193.61 ns 257.91 ns 258.08 ns
/A{1,3}(B){1,3}/g    309.29 ns/iter (244.18 ns … 463.61 ns) 349.64 ns 459.74 ns 463.61 ns
/(A)(B){1,3}/g       273.13 ns/iter (244.92 ns … 465.79 ns)  279.6 ns 409.03 ns 465.79 ns
/(A){1,3}(B){1,3}/g  276.53 ns/iter  (254.7 ns … 488.82 ns) 270.53 ns 487.69 ns 488.82 ns

runtime: bun 1.0.7 (x64-linux)
benchmark                time (avg)             (min … max)       p75       p99      p995
----------------------------------------------------------- -----------------------------
/A{1,3}B{1,3}/g      147.16 ns/iter (130.07 ns … 269.99 ns) 143.71 ns 266.04 ns 268.78 ns
/A{1,3}(B){1,3}/g    312.42 ns/iter (282.64 ns … 515.22 ns) 309.78 ns 493.53 ns 515.22 ns
/(A)(B){1,3}/g       201.18 ns/iter (177.85 ns … 343.12 ns) 199.57 ns 329.35 ns 332.58 ns
/(A){1,3}(B){1,3}/g    6.81 µs/iter     (6.53 µs … 7.69 µs)   6.94 µs   7.69 µs   7.69 µs

Do you want me to create a new issue or is this the same problem?
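
A side note on the patterns above: when only the full matches are needed, as with `String.prototype.match` and the `g` flag, the capture groups do not change the result, so the variant without groups (the fast one in both runtimes above) might serve as a stopgap. A minimal sketch, assuming a made-up input string that is not part of the benchmark; whether dropping the groups avoids the slow path in every case is not established here:

```javascript
// Illustrative input (an assumption, not taken from the benchmark above).
const input = "AB AAB ABBB AAAABBBB xyz";

// Two of the benchmark patterns: without and with capture groups.
const plain = /A{1,3}B{1,3}/g;
const captured = /(A){1,3}(B){1,3}/g;

// With the g flag, String.prototype.match returns only the full matches,
// so the capture groups have no effect on what comes back.
const plainMatches = input.match(plain);
const capturedMatches = input.match(captured);

const sameResult =
  JSON.stringify(plainMatches) === JSON.stringify(capturedMatches);
console.log(plainMatches, sameResult);
```

This only applies when the captured substrings are not used, for example not with `exec`, `matchAll`, or `replace` callbacks that read the groups.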

p0358 commented 6 months ago

@knowhatamine What are you talking about? That has nothing to do with this issue, where a poor/unoptimized RegExp code path in JavaScriptCore (affecting not only Bun, but other projects based on it too) causes the slowdown. The exact same problem is observed on native Linux and macOS; we tested it. It has nothing to do with WSL or file reading. Otherwise the bug report would have been about file reading alone, without bringing Markdown parsing into it!

knowhatamine commented 6 months ago

you're right. wrong tab.

knowhatamine commented 6 months ago

but interesting indeed, i just started using both bun and wsl2 and use a LOT of regex. will watch out for that. haven't noticed anything so far, despite being extremely performance-obsessed.

p0358 commented 6 months ago

There are a few conditions that (apparently) prevent a RegExp from being JIT-compiled, as one of the WebKit devs explained here: https://bugs.webkit.org/show_bug.cgi?id=258706#c2 Sadly, there has been no activity there since September.
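
As far as I know there is no way to ask from script whether a given RegExp was JIT-compiled, so a rough way to spot an affected pattern is to time it against a simpler variant. A minimal sketch, assuming a hypothetical `nsPerCall` helper, arbitrary iteration counts, and a shortened test string; the numbers it prints are only indicative and will vary by machine and runtime:

```javascript
// Hypothetical helper: average wall-clock time per call of `fn`, in nanoseconds.
function nsPerCall(fn, iterations = 20000) {
  for (let i = 0; i < 1000; i++) fn(); // warm-up so both patterns get a fair start
  const start = performance.now();
  for (let i = 0; i < iterations; i++) fn();
  return ((performance.now() - start) * 1e6) / iterations;
}

// Shortened stand-in for the benchmark text earlier in the thread.
const text =
  "Lorem ipsum dolor sit amet, consectetur adipisici elit. ".repeat(8);

// Pattern under suspicion, plus a variant without capture groups as a baseline.
const suspect = /(A){1,3}(B){1,3}/g;
const baseline = /A{1,3}B{1,3}/g;

const suspectNs = nsPerCall(() => text.match(suspect));
const baselineNs = nsPerCall(() => text.match(baseline));
const ratio = suspectNs / baselineNs;

console.log(
  `suspect: ${suspectNs.toFixed(0)} ns, baseline: ${baselineNs.toFixed(0)} ns, ` +
    `ratio: ${ratio.toFixed(1)}x`
);
```

Running the same file under both `node` and `bun` and comparing the ratios gives a quick hint; a ratio far above what the other runtime shows suggests the pattern is hitting a slow path, though it does not prove why.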

And yeah, I agree WSL is junk on its own; that thing refuses to even start up for me 80% of the time (sometimes both WSL2 and WSL1, and that's while normal Hyper-V VMs keep working, wonders of Windows).