oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
https://bun.sh

Segfault adding embedded file to `build --compile` #13522

Closed ctjlewis closed 4 days ago

ctjlewis commented 4 weeks ago

How can we reproduce the crash?

Adding embedded file to bun build --compile ... from #13421 with the following pattern:

    "bundle": "bun build src/**.ts src/**.tsx --splitting --outdir dist --target node",
    "postbundle": "cp node_modules/yoga-wasm-web/dist/yoga.wasm dist",
    "compile": "bun build --compile dist/bin.js dist/yoga.wasm --outfile dist/bin",

Running bundle:

dist
├── bin
├── bin.js
├── devtools-jk2ya82f.js
├── experiment-m3g28wzr.js
├── experiment-wfe5y8kq.js
├── experiment-znt6ydvp.js
├── experiment.js
├── test.js
└── yoga.wasm

Running compile:

$ bun build --compile dist/bin.js dist/yoga.wasm --outfile dist/bin
============================================================
Bun Canary v1.1.27-canary.3 (7529cd76) macOS Silicon
macOS v14.6.1
Args: "bun" "build" "--compile" "dist/bin.js" "dist/yoga.wasm" "--outfile" "dist/bin"
Features: define dotenv tsconfig_paths tsconfig 
Elapsed: 69ms | User: 72ms | Sys: 11ms
RSS: 69.21MB | Peak: 69.21MB | Commit: 1.07GB | Faults: 36

panic(main thread): Segmentation fault at address 0x4
oh no: Bun has crashed. This indicates a bug in Bun, not your code.

To send a redacted crash report to Bun's team,
please file a GitHub issue using the link below:

 https://bun.report/1.1.27/Mb27529cd7AgsgggD__23umI2gnwI2l0xGmhmxG__A2AI

error: script "compile" was terminated by signal SIGTRAP (Trace or breakpoint trap)
Trace/BPT trap: 5

If we run without dist/yoga.wasm, we can compile it, but get the old error (and running bun dist/bin.js works without issue):

error: Cannot find module "./yoga.wasm" from "/$bunfs/root/bin"

Relevant log output

bun build src/**.ts src/**.tsx --splitting --outdir dist --target node

  experiment.js             322.09 KB

  test.js                   2.09 KB

  bin.js                    282.77 KB

  devtools-jk2ya82f.js      681.75 KB

  experiment-m3g28wzr.js    1023.13 KB

  experiment-wfe5y8kq.js    1.12 KB

  experiment-znt6ydvp.js    0.24 KB

Stack Trace (bun.report)

Bun v1.1.27-canary (7529cd7) on macos aarch64 [BuildCommand]

Segmentation fault at address 0x00000004

Features: define, dotenv, tsconfig_paths, tsconfig

bephrem1 commented 4 weeks ago

@Jarred-Sumner @ctjlewis Do we have any workarounds at the moment for getting CLI apps using Ink to build?

ctjlewis commented 4 weeks ago

cc @Jarred-Sumner. @bephrem1, you can probably close your issue; I'm tracking this across two different issues, and they got that other linked PR in specifically for this case. Embedding files segfaults right now, and this issue will cover that. The repro is specifically for that case.

ctjlewis commented 4 weeks ago

@bephrem1 Sorry, forgot to share my shim for now. If you make a bundle, shim the require, and compile that, as long as yoga.wasm is next to it at runtime, the executable program will work.

// shim.ts
const bin = Bun.file("dist/bin.js")
let content = await new Response(bin).text()

// Replace createRequire(import.meta.url) with require
content = content.replace(/createRequire\(import\.meta\.url\)/g, "require")

// Replace alias(import.meta.url), where alias comes from "createRequire as alias"
const createRequireAsRegex = /(?<=createRequire as )[\w$]+/g
const matches = content.match(createRequireAsRegex)

if (!matches?.length) {
  throw new Error("No matches found")
}

for (const match of matches) {
  const pattern = `${match}(import.meta.url)`
  content = content.replaceAll(pattern, "require")
}

await Bun.write("dist/bin.js", content)
console.log("Successfully shimmed dist/bin.js")

    "bundle": "bun build --bundle --minify src/bin.tsx --outfile dist/bin.js --target node",
    "postbundle": "bun shim.ts && cp node_modules/yoga-wasm-web/dist/yoga.wasm dist",
    "compile": "bun build --compile dist/bin.js --outfile dist/bin",
    "build": "bun run bundle && bun run compile",

bephrem1 commented 4 weeks ago

Awesome, thanks! Will the final build command be bun build --compile dist/bin.js dist/yoga.wasm --outfile dist/bin, or will it "just work" without the node_modules/yoga-wasm-web/dist/yoga.wasm copy step?

ctjlewis commented 4 weeks ago

bun run build will make the bundle and compile it. You will need the copy step to move the yoga.wasm from deep in node_modules where it's installed as a dependency, to right next to the bundle at ./dist/yoga.wasm.

The code in ink that asks for ./yoga.wasm won't be in node_modules after it's bundled or compiled; it will be in dist. So yoga.wasm needs to be copied next to the output bundle/executable, because that's where it will look when the executable runs.

When this issue is fixed, the file will be included in the binary and it won't need to read from an external file at runtime. We'll do bun build --compile src/bin.tsx node_modules/yoga-wasm-web/dist/yoga.wasm and it will embed the file directly into the executable and you can just distribute it directly.

For now, if you distribute the program, you'll need to install the yoga.wasm and bin to the same install directory on user machines.
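
A minimal sketch of an install-time sanity check for that layout, assuming (as described above) that the executable looks for yoga.wasm in its own directory. The helper names `siblingAssetPath` and `hasSiblingAsset` are hypothetical and not part of Bun or Ink:

```typescript
import { existsSync } from "node:fs";
import { dirname, join } from "node:path";

// Hypothetical helper: where a sibling asset is expected, given the binary's path.
function siblingAssetPath(binaryPath: string, assetName: string): string {
  return join(dirname(binaryPath), assetName);
}

// Hypothetical check: is yoga.wasm sitting in the same directory as the binary?
function hasSiblingAsset(binaryPath: string, assetName = "yoga.wasm"): boolean {
  return existsSync(siblingAssetPath(binaryPath, assetName));
}

// Example: report whether the layout under ./dist is complete.
console.log(hasSiblingAsset(join("dist", "bin")) ? "yoga.wasm found" : "yoga.wasm missing");
```

An installer could run a check like this after copying files and fail early instead of hitting the "Cannot find module" error at runtime.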

bephrem1 commented 4 weeks ago

This makes sense. I'm more wondering: will there come a point where we can just run bun build --compile src/index.tsx --outfile dist/bin and Bun auto-detects the file import happening within Ink (and grabs yoga.wasm for the executable build for us)? Or will we always need to detect the static files that imported libraries use, copy them to dist, and add them to bun build --compile?

> For now, if you distribute the program, you'll need to install the yoga.wasm and bin to the same install directory on user machines.

yep, got it

> When this issue is fixed, the file will be included in the binary and it won't need to read from an external file at runtime.

great 🎉

By the way thanks for the quick help & working on this 👍

ctjlewis commented 4 weeks ago

Well, in this case, you would still have to embed the file manually unless yoga-wasm-web is updated upstream. The reason is that the code they publish specifically does await readFile(...), with some complicated stuff in the middle that actually surfaced the first part of this issue, the part that requires the shim: just resolving which file to read, createRequire(import.meta.url).resolve("./yoga.wasm"), which we shim to require.resolve to get it to work correctly. They don't just import "./yoga.wasm".
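
To make that rewrite concrete, here is a minimal sketch that applies the shim's first replacement to a made-up source string rather than the real bundle. The `rewriteCreateRequire` helper and the sample line are hypothetical, for illustration only; the regex is the one from the shim posted earlier in this thread:

```typescript
// Hypothetical helper: applies the shim's first replacement to a source string.
function rewriteCreateRequire(source: string): string {
  return source.replace(/createRequire\(import\.meta\.url\)/g, "require");
}

// Made-up stand-in for the published pattern (not the real yoga-wasm-web source).
const sample = 'const wasmPath = createRequire(import.meta.url).resolve("./yoga.wasm");';

console.log(rewriteCreateRequire(sample));
// const wasmPath = require.resolve("./yoga.wasm");
```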

So right now it's really yoga-wasm-web itself that reads the file externally. Bun can't do it for us: it's a property of the program that, when it runs, it expects that file on disk, right next to itself. So we will have to ship it with the bundle or the executable, or embed it once embedding is fixed. When embedding works, the executable "sees" the file at ./yoga.wasm in the embedded filesystem, and it is "included" in ./dist/bin, so you won't need to distribute yoga.wasm, but you will need to tell Bun to embed it in the --compile call:

bun build --compile src/bin.tsx node_modules/yoga-wasm-web/dist/yoga.wasm

ctjlewis commented 4 weeks ago

OK, one last comment on this, because we actually cannot just embed from node_modules either; as of right now we still need to copy. Embedded files map to their actual locations when they're embedded, but the relative paths that dependencies expect change: they no longer point into various node_modules folders and are now all relative to --outfile. This is true for --bundle or --compile.
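
A small sketch of the path argument, modelling only the behaviour described in this comment and not Bun's actual resolution logic. The `specifierFromEntry` helper is hypothetical; it computes how a file would be addressed relative to the entry point's directory:

```typescript
import { posix } from "node:path";

// Hypothetical helper: the specifier a file has relative to the entry point's directory.
function specifierFromEntry(entryPoint: string, file: string): string {
  const rel = posix.relative(posix.dirname(entryPoint), file);
  return rel.startsWith("../") ? rel : `./${rel}`;
}

// Copied next to the entry point: matches the "./yoga.wasm" the dependency asks for.
console.log(specifierFromEntry("dist/bin.js", "dist/yoga.wasm"));
// ./yoga.wasm

// Left in node_modules: the relative path no longer matches "./yoga.wasm".
console.log(specifierFromEntry("dist/bin.js", "node_modules/yoga-wasm-web/dist/yoga.wasm"));
// ../node_modules/yoga-wasm-web/dist/yoga.wasm
```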

So, for instance, for ./yoga.wasm to be the embedded file for the current setup (bundle -> compile, redundant later):

|- dist/
  |- bin.js
  |- yoga.wasm   # must copy from node_modules

We need to copy from node_modules, because we'll embed ./yoga.wasm in the executable relative to entrypoint ./dist/bin.js:

bun build --compile dist/bin.js dist/yoga.wasm

When we eliminate the redundant bundle step because compile works, it needs to be in src/, but either way you copy, and it goes next to the entry point:

bun build --compile src/bin.tsx src/yoga.wasm

ctjlewis commented 4 weeks ago

Actually, because this is a top-level readFile(), and not a function, Bun actually could automatically embed the file... It's genuinely an import side effect that can be inlined. () => readFile("./yoga.wasm") could not, because that's something that runs later, and not something that happens when you import the module.
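
The distinction can be sketched with a simulation. Everything here is hypothetical: `fakeReadFile` only records the path and never touches the disk, and the two functions stand in for evaluating a module, so this illustrates the timing only, not yoga-wasm-web's real code:

```typescript
// Records every simulated read so we can see when it happens.
const reads: string[] = [];

// Stand-in for readFile: logs the requested path and returns placeholder contents.
function fakeReadFile(path: string): string {
  reads.push(path);
  return `<contents of ${path}>`;
}

// Simulates a module whose top level reads the file: the read is an import side effect.
function importEagerModule(): { wasm: string } {
  const wasm = fakeReadFile("./yoga.wasm");
  return { wasm };
}

// Simulates a module that only exports a loader: nothing is read at import time.
function importLazyModule(): { load: () => string } {
  return { load: () => fakeReadFile("./yoga.wasm") };
}

importEagerModule();
console.log(reads.length); // 1: the read happened during "import"

const lazy = importLazyModule();
console.log(reads.length); // 1: importing the lazy module read nothing

lazy.load();
console.log(reads.length); // 2: the read only happens when the function is called
```

Only the first shape is knowable at build time from evaluating the module's top level, which is why it is the one a bundler could in principle inline.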

Though this is truly so galaxy brained that I've never seen it done. Typically, when bundling, you have to manage the files your dependencies load as side effects at runtime yourself, and move them so they sit relative to the outfile as they did to their original dependency entrypoint. IDK if any JS bundlers have tried to do this automatically before, but it should be possible in theory.

./yoga.wasm belongs to the import behavior of node_modules/yoga-wasm-web/dist/index.js or whatever, it never changes and it does not belong to the user, it's part of the dependency we're bundling when we import it. It would still be embedded as ./yoga.wasm, keeping its relative position to bundled --outfile as it was to the original dependency entrypoint.

Jarred-Sumner commented 4 weeks ago

debug logs:

❯ bun-debug build --compile dist/bin.js dist/yoga.wasm --outfile dist/bin
[SYS] openat(-2, <repo>/bunfig.toml) = -1
[SYS] openat(7[<repo>], package.json) = 8
[fs] openat(7[<repo>], <repo>/package.json) = 8[<repo>/package.json]
[SYS] close(8[<repo>/package.json])
[alloc] new(PackageJSON) = src.resolver.package_json.PackageJSON@12a704080
[SYS] openat(7[<repo>], tsconfig.json) = 8
[fs] openat(7[<repo>], <repo>/tsconfig.json) = 8[<repo>/tsconfig.json]
[alloc] new(TSConfigJSON) = src.resolver.tsconfig_json.TSConfigJSON@12b004080
[SYS] close(8[<repo>/tsconfig.json])
[SYS] close(3[/])
[SYS] close(4[/Users])
[SYS] close(5[/Users/jarred])
[SYS] close(6[/Users/jarred/Build])
[SYS] close(7[<repo>])
[ThreadPool] 16 workers
[SYS] close(4[<repo>/node_modules])
[SYS] openat(-2, <repo>/node_modules/dist) = -1
[SYS] openat(-2, <repo>/dist) = 4
[SYS] close(4[<repo>/dist])
[SYS] openat(-2, <repo>/node_modules/dist) = -1
[ThreadPool] Worker.create()
[ThreadPool] Worker.create()
[ThreadPool] Worker.create()
[SYS] openat(-2, <repo>/dist/yoga.wasm) = 4
[fs] openat(0, <repo>/dist/yoga.wasm) = 4[<repo>/dist/yoga.wasm]
[SYS] openat(-2, <repo>/dist/bin.js) = 5
[fs] openat(0, <repo>/dist/bin.js) = 5[<repo>/dist/bin.js]
[SYS] close(4[<repo>/dist/yoga.wasm])
[SYS] close(5[<repo>/dist/bin.js])
[Bundle] onParse(2, <repo>/dist/yoga.wasm) = 0 imports, 0 exports
[Bundle] onParse(1, <repo>/dist/bin.js) = 2 imports, 0 exports
[Bundle] onParse(0, runtime) = 0 imports, 21 exports
[Bundle] Parsed 3 files, producing 3 ASTs
[LinkerCtx] Step 1: 0 CommonJS modules (+ 0 wrapped), 2 ES modules (+ 0 wrapped)
[ThreadPool] Worker.create()
[LinkerCtx] Binding 0 imports for file runtime (#0)
[LinkerCtx] Binding 2 imports for file <repo>/dist/bin.js (#1)
[LinkerCtx] Binding 0 imports for file <repo>/dist/yoga.wasm (#2)
[part_dep_tree] markPartLiveForTreeShaking: 1:3 -- 1:2
[part_dep_tree] markPartLiveForTreeShaking 1:2 | EMPTY
[part_dep_tree] markPartLiveForTreeShaking: 1:4 -- 1:1
[part_dep_tree] markPartLiveForTreeShaking 1:1 | EMPTY
[part_dep_tree] markPartLiveForTreeShaking: 1:4 -- 1:3
[part_dep_tree] markPartLiveForTreeShaking: 1:5 -- 1:4
[part_dep_tree] markPartLiveForTreeShaking 1:6 | EMPTY
[part_dep_tree] markPartLiveForTreeShaking 2:1 | EMPTY
[part_dep_tree] markPartLiveForTreeShaking: 2:3 -- 2:2
[part_dep_tree] markPartLiveForTreeShaking 2:2 | EMPTY
[LinkerCtx]  START 2 renamers
[LinkerCtx]   DONE 2 renamers
[LinkerCtx]  START 2 compiling part ranges
[LinkerCtx]   DONE 2 compiling part ranges
[LinkerCtx]  START 2 postprocess chunks
[LinkerCtx]   DONE 2 postprocess chunks
============================================================
Bun Debug v1.1.27 (4cf1e370) macOS Silicon
macOS v14.4.1
Args: "bun-debug" "build" "--compile" "dist/bin.js" "dist/yoga.wasm" "--outfile" "dist/bin"
Features: define tsconfig
Elapsed: 12ms | User: 10ms | Sys: 27ms
RSS: 50.63MB | Peak: 50.63MB | Commit: 1.04GB | Faults: 72

panic(main thread): reached unreachable code
/Users/jarred/Code/bun/src/bun.zig:3362:43: 0x10416167f in assert (bun-debug)
        if (comptime Environment.isDebug) unreachable;
                                          ^
/Users/jarred/Code/bun/src/bundler/bundle_v2.zig:9643:31: 0x1042f75a7 in appendIsolatedHashesForImportedChunks (bun-debug)
                    bun.assert(additional_files.len > 0);
                              ^
/Users/jarred/Code/bun/src/bundler/bundle_v2.zig:9221:56: 0x1042f0f77 in generateChunksInParallel (bun-debug)
                c.appendIsolatedHashesForImportedChunks(&hash, chunks, @intCast(index), &chunk_visit_map);
                                                       ^
/Users/jarred/Code/bun/src/bundler/bundle_v2.zig:1075:56: 0x1042fa0cb in generateFromCLI (bun-debug)
        return try this.linker.generateChunksInParallel(chunks);
                                                       ^
/Users/jarred/Code/bun/src/cli/build_command.zig:294:49: 0x1043309cb in exec (bun-debug)
            break :brk (BundleV2.generateFromCLI(
                                                ^
/Users/jarred/Code/bun/src/cli.zig:1538:38: 0x1045d7f87 in start (bun-debug)
                try BuildCommand.exec(ctx);
                                     ^
/Users/jarred/Code/bun/src/cli.zig:62:22: 0x10415947f in start (bun-debug)
        Command.start(allocator, log) catch |err| {
                     ^
/Users/jarred/Code/bun/src/main.zig:50:22: 0x104157a1b in main (bun-debug)
    bun.CLI.Cli.start(bun.default_allocator);
                     ^
/Users/jarred/Code/bun/src/deps/zig/lib/std/start.zig:514:22: 0x10415789b in main (bun-debug)
            root.main();
                     ^
???:?:?: 0x19da9a0df in ??? (???)

fish: Job 1, 'bun-debug build --compile dist/…' terminated by signal SIGTRAP (Trace or breakpoint trap)

bephrem1 commented 3 weeks ago

Are there any updates on this? :-)

ctjlewis commented 3 weeks ago

😕 give them a few days king

you can compile and distribute with the workaround for now: https://github.com/oven-sh/bun/issues/13522#issuecomment-2311173485