nodejs / postject

Easily inject arbitrary read-only resources into executable formats (Mach-O, PE, ELF) and use it at runtime.

Add a test injecting a large resource #12

Open jviotti opened 2 years ago

dfellis commented 1 year ago

What is considered a large resource? I successfully tested this project with the new Node Single Executable Application functionality using a relatively trivial size. But when I created a bundle that is 3.1MB in size (and confirmed that the bundle runs with a node build/main.js call), postject failed with only a very terse error:

$ npx postject node NODE_SEA_BLOB sea-prep.blob --sentinel-fuse NODE_SEA_FUSE_fce680ab2cc467b6e072b8b5df1996b2
Start injection of NODE_SEA_BLOB in node...
Error: Error when injecting resource

I can't find any arg to enter a verbose mode, so not sure how to debug further.

RaisinTen commented 1 year ago

What is considered a large resource?

I was trying to find that out in https://github.com/nodejs/postject/pull/7 where I tried injecting large files into a tiny hello-world executable and I posted about my findings in https://github.com/nodejs/single-executable/issues/23#issuecomment-1252283739.

that is 3.1MB in size, postject provides a very simple error that it failed

@dfellis looking at the command, I'm guessing that happened on Linux? (very curious what the distro name and the arch is) I tried injecting a 16MB file into the official Node.js Linux binary (92 MB) I downloaded from the website and it is able to finish the injection with no errors for me (final executable size - 107 MB). It also ran with no errors.

I can't find any arg to enter a verbose mode, so not sure how to debug further.

Assuming that this is an ELF binary, the error was reported from https://github.com/nodejs/postject/blob/3edd1dd1e0690167d7db0f502620e41888b2f82e/src/postject.cpp#L48, which would happen if https://github.com/nodejs/postject/blob/3edd1dd1e0690167d7db0f502620e41888b2f82e/src/postject.cpp#L45 returned null. As you said, there were no additional logs, so there is probably a part of LIEF's code where this error happens without being logged. To find it, you would need to build Postject from source on your system and trace the calls inside the LIEF::ELF::Parser::parse function of LIEF (a C++ dependency of this project that does the actual injection) - https://github.com/nodejs/postject/blob/3edd1dd1e0690167d7db0f502620e41888b2f82e/vendor/lief/src/ELF/Parser.cpp#L346-L355.

dfellis commented 1 year ago

@RaisinTen thanks for replying. Couldn't get back to this until today.

It's a mostly stock Ubuntu 22.04 LTS (with System76 kernel and drivers). The node binary I'm using is from nvm:

$ whereis node
node: /home/damocles/.nvm/versions/node/v20.0.0/bin/node

But I was under the impression that these are basically just stock, as well.

As for building from source and tracing, is there a particular version of emscripten that should be used, as it appears that it's presumed to be set up outside of the build system?

Trying to catch throw in gdb isn't picking up on the failure, though (I don't have a debug build yet, but I figured I could try to figure out how to grab the error first and then figure out the debug build second). Any suggestions on that front?

dfellis commented 1 year ago

Hmm... so while waiting, I decided to just dig into the Node.js code for postject to see what's going on, adding some console.logs to the inject function, and I got something interesting:

Start injection of NODE_SEA_BLOB in node...
{ executableFormat: ctor {} }
[ 'value', 'constructor' ]
3
[Function: ExecutableFormat_kUnknown]
undefined
[Function: ctor] {
  values: { '0': ctor {}, '1': ctor {}, '2': ctor {} },
  kAlreadyExists: ctor {},
  kError: ctor {},
  kSuccess: ctor {}
}
Error: Error when injecting resource

Two things:

  1. The check that the executable format is supported is incorrect: an unknown executable format is truthy.
  2. I think I realized what's going on here. When I try to build stuff with yarn build, yarn is injecting its own wrapper around node in the $PATH (for some reason that I'm not entirely sure of), like this:
$ cat /tmp/yarn--1682713927311-0.8177720901713921/node
#!/bin/sh

exec "/home/damocles/.nvm/versions/node/v20.0.0/bin/node" "$@"

And that is not a binary, so when my build script (which is mostly just a copy-paste of the official docs) runs, command -v node returns that weird wrapper, and the script grabs the wrong file to attach the payload to.
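One way to catch a wrapper like this early is to look at the first bytes of the file before handing it to postject. The following is a minimal sketch, not postject's or LIEF's actual detection logic: sniffFormat and sniffFile are hypothetical helper names, and the sketch assumes that checking the leading four bytes against the well-known magic numbers (shebang, ELF, the MZ header used by PE files, Mach-O) is enough for a sanity check.

```javascript
// Sketch only: classify a file by its leading bytes.
// `sniffFormat` and `sniffFile` are hypothetical helpers, not postject APIs.
const fs = require('node:fs');

function sniffFormat(header) {
  if (header.length >= 2 && header[0] === 0x23 && header[1] === 0x21) {
    return 'script'; // "#!" shebang, e.g. a /bin/sh wrapper script
  }
  if (header.length >= 4 && header.readUInt32BE(0) === 0x7f454c46) {
    return 'ELF'; // 0x7f 'E' 'L' 'F'
  }
  if (header.length >= 2 && header[0] === 0x4d && header[1] === 0x5a) {
    return 'PE'; // "MZ" DOS header that precedes the PE header
  }
  if (header.length >= 4) {
    const magic = header.readUInt32BE(0);
    // 32-bit, 64-bit, their byte-swapped forms, and fat/universal binaries.
    // Note: 0xcafebabe is also used by Java class files.
    const machO = [0xfeedface, 0xfeedfacf, 0xcefaedfe, 0xcffaedfe, 0xcafebabe];
    if (machO.includes(magic)) return 'Mach-O';
  }
  return 'unknown';
}

function sniffFile(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const header = Buffer.alloc(4);
    const bytesRead = fs.readSync(fd, header, 0, 4, 0);
    return sniffFormat(header.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}

// process.execPath is the binary actually running this script, so it is
// not affected by wrapper scripts placed earlier on $PATH.
console.log(process.execPath, sniffFile(process.execPath));
```

In a build script, copying process.execPath instead of the result of command -v node should sidestep the wrapper, since process.execPath points at the real Node.js binary.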

And now when I run my build script with npm run build instead, I get

Start injection of NODE_SEA_BLOB in node...
{ executableFormat: ctor {} }
[ 'value', 'constructor' ]
0
[Function: ExecutableFormat_kELF]
warning: Can't find string offset for section name '.note.100'
warning: Can't find string offset for section name '.note.100'
warning: Can't find string offset for section name '.note.100'
warning: Can't find string offset for section name '.note.100'
warning: Can't find string offset for section name '.note.100'
warning: Can't find string offset for section name '.note'
warning: Can't find string offset for section name '.note.100'
warning: Can't find string offset for section name '.note.100'
warning: Can't find string offset for section name '.note.100'
{
  sectionName: 'NODE_SEA_BLOB',
  result: ctor {},
  data: Uint8Array(99945600) [
    127,  69,  76, 70,  2, 1,   1, 3,   0,   0,  0, 0,
      0,   0,   0,  0,  2, 0,  62, 0,   1,   0,  0, 0,
     32, 166, 185,  0,  0, 0,   0, 0, 224, 252, 22, 2,
      0,   0,   0,  0,  0, 0, 245, 5,   0,   0,  0, 0,
      0,   0,   0,  0, 64, 0,  56, 0,  14,   0, 64, 0,
     50,   0,  40,  0,  6, 0,   0, 0,   4,   0,  0, 0,
     64,   0,   0,  0,  0, 0,   0, 0,  64,   0, 64, 0,
      0,   0,   0,  0, 64, 0,  64, 0,   0,   0,  0, 0,
    216,   2,   0,  0,
    ... 99945500 more items
  ]
}
💉 Injection done!

Confirming that it's not a "too large" issue: yarn wraps calls to node with its own logic, and postject was not catching this.

If you want, I can make a small PR to correct this (already confirmed here):

Start injection of NODE_SEA_BLOB in node...
Error: Executable must be a supported format: ELF, PE, or Mach-O
error Command failed with exit code 1.
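
The truthiness problem from point 1 can be reproduced without postject. This is a sketch under assumptions: ExecutableFormat below is a hand-written stand-in for the enum that the compiled module exposes, with each member modeled as an object carrying a numeric value (kELF = 0 and kUnknown = 3, as in the logs above). The names isSupportedBuggy and isSupported are hypothetical and are not postject's actual functions.

```javascript
// Hypothetical stand-in for the enum seen in the logs: every member is an
// object, so every member is truthy in JavaScript, including kUnknown.
const ExecutableFormat = {
  kELF: { value: 0 },
  kUnknown: { value: 3 },
};

// Buggy check: relies on truthiness, which is always true for objects.
function isSupportedBuggy(format) {
  return Boolean(format);
}

// Safer check: compare the numeric value against kUnknown explicitly.
function isSupported(format) {
  return format.value !== ExecutableFormat.kUnknown.value;
}

console.log(isSupportedBuggy(ExecutableFormat.kUnknown)); // true (the bug)
console.log(isSupported(ExecutableFormat.kUnknown)); // false
console.log(isSupported(ExecutableFormat.kELF)); // true
```

With an explicit comparison, a shell wrapper is rejected up front instead of reaching the injection step.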

RaisinTen commented 1 year ago

Thanks for looking into this, great work! 🎉


And to answer your questions:

is there a particular version of emscripten that should be used

Not really because Postject's CircleCI config currently uses the latest commit from emsdk's main - https://github.com/nodejs/postject/blob/3edd1dd1e0690167d7db0f502620e41888b2f82e/.circleci/config.yml#L105, so I think that should work for you too.

Trying to catch throw in gdb isn't picking up on the failure

Are you referring to the exceptions that are being thrown in JS? That actually causes the Node.js process to print the error info and exit.

dfellis commented 1 year ago

Trying to catch throw in gdb isn't picking up on the failure

Are you referring to the exceptions that are being thrown in JS? That actually causes the Node.js process to print the error info and exit.

I was running gdb node --args ./node_modules/.bin/postject ... and then tried to use the catch throw command to catch exceptions down at the binary level. I was hoping to find the LIEF error that way, even without debug symbols, to make sure I was debugging it correctly. That didn't work, and while probing in other ways I figured out what was actually happening (see my second comment).