vercel / pkg

Package your Node.js project into an executable
https://npmjs.com/pkg
MIT License

ffi-napi unable to load embedded library #1744

Closed j1elo closed 1 year ago

j1elo commented 2 years ago

What version of pkg are you using?

5.8.0

What version of Node.js are you using?

16.17.0

What operating system are you using?

Ubuntu 20.04

What CPU architecture are you using?

x86_64

What Node versions, OSs and CPU architectures are you building for?

node16-linux-x64

Describe the Bug

I'm unable to run a pkg'ed application that loads a .so file at runtime. At first this seemed possibly related to https://github.com/vercel/pkg/issues/1349, but in the end I think it is a different thing.

The system is Linux Mint 20.3 === Ubuntu 20.04

An application that uses vosk installs a shared library under node_modules/vosk/lib/linux-x86_64/libvosk.so and loads it through ffi-napi and ref-napi, but the pkg'ed executable fails to run with the following error:

$ dist/bin/myapp
pkg/prelude/bootstrap.js:1740
      throw error;
      ^

Error: Dynamic Linking Error: /snapshot/home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so: cannot open shared object file: No such file or directory
    at new DynamicLibrary (/snapshot/home/code/myapp/node_modules/ffi-napi/lib/dynamic_library.js:75:11)
    at Object.Library (/snapshot/home/code/myapp/node_modules/ffi-napi/lib/library.js:47:10)
    at Object.<anonymous> (/snapshot/home/code/myapp/node_modules/vosk/index.js:84:21)
    at Module._compile (pkg/prelude/bootstrap.js:1794:22)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1153:10)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12)
    at Module.require (node:internal/modules/cjs/loader:1005:19)
    at Module.require (pkg/prelude/bootstrap.js:1719:31)
    at require (node:internal/modules/cjs/helpers:94:18)

I'm running pkg with this command: pkg pkg.config.json, where the file contents are as follows:

{
  "name": "myapp",
  "bin": "dist/main.js",
  "private": true,
  "pkg": {
    "assets": [
      "node_modules/ffi-napi/build/Release/ffi_bindings.node",
      "node_modules/ref-napi/prebuilds/linux-x64/node.napi.node",
      "node_modules/vosk/lib/linux-x86_64/libvosk.so"
    ],
    "outputPath": "dist/bin/",
    "targets": ["node16-linux-x64"]
  }
}

Lastly, these are the contents of /tmp/pkg/:

$ tree -a /tmp/pkg/
/tmp/pkg/
├── 94544448971968a49fabf8226f94e981d9a51f06abcecd8a98dd023986f814c2
│   └── ffi-napi
│       ├── build
│       │   └── Release
│       │       └── ffi_bindings.node
│       ├── lib
│       │   ├── bindings.js
│       │   ├── callback.js
│       │   ├── cif.js
│       │   ├── cif_var.js
│       │   ├── dynamic_library.js
│       │   ├── errno.js
│       │   ├── ffi.js
│       │   ├── _foreign_function.js
│       │   ├── foreign_function.js
│       │   ├── foreign_function_var.js
│       │   ├── function.js
│       │   ├── library.js
│       │   └── type.js
│       └── package.json
└── c0242e741053036807a4c1e2c3ed290cd56d19e97f5e106570434ed84191500c
    └── ref-napi
        ├── lib
        │   └── ref.js
        ├── package.json
        └── prebuilds
            └── linux-x64
                └── node.napi.node

My suspicion is that, given how pkg handles .node modules (Native addons), the ffi-napi and ref-napi modules get extracted to the host filesystem at /tmp/pkg/, but then these cannot find the libvosk.so file that only exists within the dist/bin/myapp executable.

Do you think I'm right there? Is there any way to help ffi-napi find the file it is expecting, to make the application work?

Expected Behavior

I'd expect that the libvosk.so file is found and successfully loaded by the FFI library.

To Reproduce

package.json

{
  "name": "myapp",
  "version": "0.0.1",
  "dependencies": {
    "vosk": "0.3.39"
  },
  "devDependencies": {
    "pkg": "5.8.0"
  }
}

pkg.config.json

{
  "name": "myapp",
  "bin": "main.js",
  "private": true,
  "pkg": {
    "assets": [
      "node_modules/ffi-napi/build/Release/ffi_bindings.node",
      "node_modules/ref-napi/prebuilds/linux-x64/node.napi.node",
      "node_modules/vosk/lib/linux-x86_64/libvosk.so"
    ],
    "outputPath": "./",
    "targets": ["node16-linux-x64"]
  }
}

main.js

const vosk = require("vosk");

Commands

$ npm install
$ npx pkg --debug pkg.config.json
$ rm -rf /tmp/pkg/
$ DEBUG_PKG=1 ./myapp

robertsLando commented 2 years ago

but then these cannot find the libvosk.so file that only exists within the dist/bin/myapp executable.

You added it to assets like you did for the others, so it should be mapped to the /tmp folder too. This suggests that the path you provided in assets could be wrong. Try running pkg with debug enabled to see whether it has been added to the executable.

j1elo commented 2 years ago

I tend not to write paths by hand, to avoid this kind of mistake; in this case I ran a find command (find . | grep libvosk.so) and copied the result.

Creating the package with debug (using the reproduction sample I provided):

$ npx pkg --debug pkg.config.json

this can be seen:

> [debug]  Adding asset : .... 
  /home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so
> [debug] Content of /home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so is added to queue. It was required from /home/code/myapp/pkg.config.json
[...]
> [debug] Stat info of /home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so is added to queue.
[...]
> [debug] The file was included as asset content
  /home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so
> [debug] The file was included as asset content
  /home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so
[...]
> [debug] files & folders deduped = 
  libvosk.so
> [debug] The directory files list was included (1 item)
  /home/code/myapp/node_modules/vosk/lib/linux-x86_64
> [debug] The directory files list was included (1 item)
  /home/code/myapp/node_modules/vosk/lib/linux-x86_64

I'm not sure why, but all files seem to be included twice. The dedupe step seems to know this, though, and claims to remove the duplicated files.

Later, cleaning up and running with debug enabled:

$ rm -rf /tmp/pkg/
$ DEBUG_PKG=1 ./myapp

the file can be seen to be there:

------------------------------- virtual file system
/snapshot
  myapp                                  100856052 
    node_modules                         100856052 
      vosk                               100031004 
        package.json                           573 
        index.js                             14359 
        lib                               85834332 
          linux-x86_64                    25994752 
            libvosk.so                    25994752

The path that Vosk's index.js uses to load the library is as follows (source here):

path.join(__dirname, "lib", "linux-x86_64", "libvosk.so")

which seems correct to me.

But, after running, the .so file wasn't extracted to /tmp, only the .node files (ffi-napi, ref-napi):

$ tree -a -L 2 /tmp/pkg/
/tmp/pkg/
├── 83433adbb4339ae54892655838b35cb9e76ea0d57e5ffb9eafd5c78114bf5524
│   └── ffi-napi
└── c0242e741053036807a4c1e2c3ed290cd56d19e97f5e106570434ed84191500c
    └── ref-napi

robertsLando commented 2 years ago

@j1elo

So the error says it cannot find this:

/snapshot/home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so

But it is mapped to (by checking the debug output):

/snapshot/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so

That's the error
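
The mismatch is easier to see when the two paths are compared segment by segment. A minimal sketch (the helper name `extraSegments` is made up for this illustration and is not part of pkg):

```javascript
// Hypothetical helper: strip the shared leading and trailing segments of two
// POSIX-style paths and return whatever is left over in each.
function extraSegments(a, b) {
  const as = a.split("/").filter(Boolean);
  const bs = b.split("/").filter(Boolean);

  // Skip the common prefix (here: "snapshot").
  let start = 0;
  while (start < as.length && start < bs.length && as[start] === bs[start]) {
    start++;
  }

  // Skip the common suffix (here: "myapp/node_modules/.../libvosk.so").
  let endA = as.length;
  let endB = bs.length;
  while (endA > start && endB > start && as[endA - 1] === bs[endB - 1]) {
    endA--;
    endB--;
  }

  return {
    onlyInFirst: as.slice(start, endA),
    onlyInSecond: bs.slice(start, endB),
  };
}

const requested = "/snapshot/home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so";
const mapped = "/snapshot/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so";

// Prints { onlyInFirst: [ 'home', 'code' ], onlyInSecond: [] }
console.log(extraSegments(requested, mapped));
```

The only difference is the extra home/code prefix in the path that the error reports.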

j1elo commented 2 years ago

You're right! Good catch, I hadn't even realized it.

Now, following this clue... note that Vosk loads the library as mentioned in my previous comment:

path.join(__dirname, "lib", "linux-x86_64", "libvosk.so")

With the error being that it cannot find this:

/snapshot/home/code/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so

This means that __dirname is being assigned (by pkg?) the value "/snapshot/home/code/myapp/node_modules/vosk" at runtime, but that doesn't match the actual path within the packaged file.

From this I get the impression that the path replacement done for __dirname should not include /home/code.

robertsLando commented 2 years ago

Here is explained how __dirname is set: https://github.com/vercel/pkg#snapshot-filesystem

I have a feeling that you are packaging the application from / ? There must be a reason for this.

j1elo commented 2 years ago

I'll look more deeply into this. I'm not doing anything strange at all; my work dir has been written as /home/code/myapp/ in my examples, to simplify, but in reality it is nothing out of the ordinary:

/home/juan/work/projects/myapp/

Note that if you try the reproduction as shown in my first post, you should encounter the same scenario... if not, then we might find out that it is some environment-specific issue.

j1elo commented 2 years ago

Testing with the repro case, the full prefix (/home/...) is not included in any of the paths that start with /snapshot/, so including this in my original report seems either a mistake on my side, or a different behavior (I'll keep digging to confirm).

But, for now, I'd be happy to focus solely on the repro case. I can confirm that:

Note that the error:

Error: Dynamic Linking Error: /snapshot/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so: cannot open shared object file: No such file or directory
    at new DynamicLibrary (/snapshot/myapp/node_modules/ffi-napi/lib/dynamic_library.js:75:11)
    at Object.Library (/snapshot/myapp/node_modules/ffi-napi/lib/library.js:47:10)
    at Object.<anonymous> (/snapshot/myapp/node_modules/vosk/index.js:86:21)

... indicates that it is ffi-napi which tries to load the given path. And ffi-napi is being extracted to the host filesystem. If this code (ffi-napi/lib/dynamic_library.js:75) is running from the host filesystem (/tmp/pkg/), then of course it won't find any path such as /snapshot/myapp/node_modules/, because that path does not exist on my computer. Is this maybe the actual problem?

j1elo commented 2 years ago

Note: I've edited the repro code, so the "assets" array contains paths to concrete files, instead of catch-all wildcards.

I.e. before:

    "assets": [
      "node_modules/ffi-napi/**/*",
      "node_modules/ref-napi/**/*",
      "node_modules/vosk/lib/**/*"
    ],

after:

    "assets": [
      "node_modules/ffi-napi/build/Release/ffi_bindings.node",
      "node_modules/ref-napi/prebuilds/linux-x64/node.napi.node",
      "node_modules/vosk/lib/linux-x86_64/libvosk.so"
    ],

robertsLando commented 2 years ago

then of course it won't find any path such as /snapshot/myapp/node_modules/, because that path does not exist on my computer. Is this maybe the actual problem?

In general, .so and .node files are copied to the /tmp folder and served from local disk when a request for them is received. That is, we intercept the request to /snapshot/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so and serve its content from disk. BTW, I have a feeling this is not working as expected for some reason.

If you want to investigate this further, the piece of code that handles it is here:

https://github.com/vercel/pkg/blob/main/prelude/bootstrap.js#L2199

As a temporary solution, you could try to manually patch the code to see if using the /tmp path makes the application work.

j1elo commented 2 years ago

EDIT -- I was mentioning node-gyp-build but it really doesn't seem to have anything to do with the issue.

OK so the problem seems to be that pkg patches Node's process.dlopen function in JavaScript, but this function never gets actually called. I added a console.log to the first line of pkg's dlopen() patch (right here), and my message shows up for the .node files but not the .so file:

#### pkg's dlopen called: /snapshot/myapp/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#### pkg's dlopen called: /snapshot/myapp/node_modules/ffi-napi/build/Release/ffi_bindings.node

So, no patched dlopen() call shows up for /snapshot/myapp/node_modules/vosk/lib/linux-x86_64/libvosk.so.

After looking into ffi-napi, this is not a surprise:

i.e. pkg is intercepting calls to Node's dlopen(), but this doesn't cover calls made directly to the underlying system through an FFI library (application -> ffi-napi -> system's dlopen()).
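
The kind of check described above can be sketched as a small logging wrapper. This is only an illustration: `traceDlopen` and its `seen` array are names invented here, and pkg's real patch in prelude/bootstrap.js does considerably more than log.

```javascript
// Wrap a `dlopen` method so every call is recorded before being forwarded.
// `target` defaults to `process`, whose `dlopen` is what Node uses to load
// .node addons.
function traceDlopen(target = process) {
  const original = target.dlopen;
  const seen = [];

  target.dlopen = function tracedDlopen(module, filename, ...rest) {
    seen.push(filename);
    console.log("#### dlopen called:", filename);
    return original.call(this, module, filename, ...rest);
  };

  // Return the log plus a way to undo the wrapping.
  return {
    seen,
    restore: () => {
      target.dlopen = original;
    },
  };
}
```

Calling `traceDlopen()` before the first `require()` of a native addon lists every .node file that goes through `process.dlopen`. A library opened through ffi-napi is not expected to appear in `seen`, since that call goes to the system's dlopen() rather than to the wrapped JavaScript function.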

Confirmation is obtained by doing this:

$ sudo mkdir -p /snapshot/myapp/node_modules/vosk/lib/linux-x86_64/
$ sudo ln -s \
    "$PWD/node_modules/vosk/lib/linux-x86_64/libvosk.so" \
    /snapshot/myapp/node_modules/vosk/lib/linux-x86_64/

The process launches correctly and without issues when the /snapshot path exists in the host filesystem, proving that the lookup is performed there, and not inside the pkg executable file.

I'm afraid this might be a roadblock; has nobody ever found this issue with .so libraries loaded with ffi-napi?

robertsLando commented 2 years ago

I'm afraid this might be a roadblock; has nobody ever found this issue with .so libraries loaded with ffi-napi?

I think the only way to fix this is to create a patch for the ffi package, like I did for pino in this issue: https://github.com/vercel/pkg/issues/1419#issuecomment-997982346

About your question: I remember some other issues with ffi-napi, but I don't remember whether they were related to this or to something else.

j1elo commented 2 years ago

@robertsLando thanks a lot for the tips and guidance given so far!

One issue I'm finding is that the "dlopen" that pkg has is a wrapper for Node's process.dlopen, which shares its name with the system's dlopen but has a quite different intended usage. In Node, process.dlopen is used to load binary modules (as in, Node.js "C++ Addon" files).

process.dlopen docs basically describe this function as "require() but for C++ Addons". These binary modules are expected to be explicitly written for Node, and they should register themselves into the Node module system, with global code such as this:

void Initialize(v8::Local<v8::Object> exports);
NODE_MODULE(module_name, Initialize)

Obviously this is very different from what the real dlopen does. So it's OK that pkg wraps the process.dlopen function to provide support for binary Node modules, but this has nothing to do with dlopening general .so shared libraries (or .dll on Windows).

process.dlopen(module, "node_modules/vosk/lib/linux-x86_64/libvosk.so") will always fail with Error: Module did not self-register, because libvosk.so is not a Node module. Thus, the version of process.dlopen wrapped by pkg is not appropriate for loading such file. The dlopen exported from C by ffi-napi is. So I would need to make pkg itself end up calling that function, and not process.dlopen.

This leaves us with a situation that is in no way solvable with a "patches" section in the pkg config, or at least I am unable to think of one. If the pkg extraction code was refactored to a public function, I might be able to call it from my code, but that would mean that pkg is no longer fully transparent to the application.

Maybe it is simply not possible to allow for this usage with how things are currently done with pkg; I'm fine with that, but before jumping to that conclusion I'd love to hear an official comment in this regard.

robertsLando commented 2 years ago

The dlopen exported from C by ffi-napi is

What you could try is to patch the prelude/bootstrap file in order to catch that kind of import and make it work the way you expect. I don't know how ffi-napi works, so I don't exactly get what you mean by

The dlopen exported from C by ffi-napi is

I suppose it needs to call: https://github.com/node-ffi-napi/node-ffi-napi/blob/00df1232a25b1b0f026b5d1b4c9efc67497e4b48/lib/dynamic_library.js#L18

robertsLando commented 2 years ago

After double-checking my previous answer, I suggest writing a patch, as I said above, for this file: if the path starts with '/snapshot/', read that file, write it to the tmp dir (like pkg does in its dlopen patch), and then set the path to the copy in tmp. This should do the trick!

Just replace this._path = path;

with something like:

// Extract libraries that live inside the snapshot so the system dlopen() can open them.
if (path.startsWith('/snapshot/')) {
    const fs = require('fs');
    const moduleContent = fs.readFileSync(path);

    const hash = require('crypto').createHash('sha256').update(moduleContent).digest('hex');
    const pathModule = require('path');
    const tmpFolder = pathModule.join(require('os').tmpdir(), hash);
    const tmpPath = pathModule.join(tmpFolder, pathModule.basename(path));

    if (!fs.existsSync(tmpFolder)) {
        fs.mkdirSync(tmpFolder);
        fs.copyFileSync(path, tmpPath);
    }

    // Reassign path so anything that uses it afterwards gets the extracted copy.
    path = tmpPath;
}

this._path = path;

I haven't tested it, but it should work.


"patches": {
            "./node_modules/ffi-napi/lib/dynamic_library.js": ["this._path = path;", "if (path.startsWith('/snapshot/')) { const fs = require('fs'); const moduleContent = fs.readFileSync(path); const hash = require('crypto').createHash('sha256').update(moduleContent).digest('hex'); const pathModule = require('path'); const tmpFolder = pathModule.join(require('os').tmpdir(), hash); const tmpPath = pathModule.join(tmpFolder, pathModule.basename(path)); if (!fs.existsSync(tmpFolder)) { fs.mkdirSync(tmpFolder); fs.copyFileSync(path, tmpPath); } path = tmpPath; } this._path = path;"]
        }

j1elo commented 2 years ago

It does work! :-D

I was 100% focused on using the existing functionality of pkg, but of course you are right: by duplicating a bit of the code found in prelude/bootstrap.js, it is possible to achieve the same behavior, which allows ffi-napi to load general-purpose .so files.

Summary of the solution

As mentioned, we'll be replacing the line this._path = path; in ffi-napi's lib/dynamic_library.js with this code:

if (path.startsWith("/snapshot/")) {
  const Fs = require("fs");
  const moduleContent = Fs.readFileSync(path);
  const hash = require("crypto").createHash("sha256").update(moduleContent).digest("hex");
  const Path = require("path");
  const tmpFolder = Path.join(require("os").tmpdir(), "pkg", hash);
  const newPath = Path.join(tmpFolder, Path.basename(path));
  if (!Fs.existsSync(tmpFolder)) {
    Fs.mkdirSync(tmpFolder, { recursive: true });
    Fs.copyFileSync(path, newPath);
  }
  path = newPath;
}
this._path = path;
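
For trying the logic outside of a packaged binary, the same steps can be written as a standalone function. This is only a sketch under assumptions: `extractFromSnapshot`, `snapshotPrefix` and `tmpRoot` are names chosen for the example, and it checks for the extracted file itself rather than only its folder, which is a small deviation from the patch above.

```javascript
const fs = require("fs");
const os = require("os");
const nodePath = require("path");
const crypto = require("crypto");

// Copy `libPath` to <tmpRoot>/<sha256 of content>/<basename> when it lives
// under `snapshotPrefix`, and return the path of the copy.
// Paths outside the snapshot are returned unchanged.
function extractFromSnapshot(libPath, options = {}) {
  const snapshotPrefix = options.snapshotPrefix || "/snapshot/";
  const tmpRoot = options.tmpRoot || nodePath.join(os.tmpdir(), "pkg");

  if (!libPath.startsWith(snapshotPrefix)) {
    return libPath;
  }

  const content = fs.readFileSync(libPath);
  const hash = crypto.createHash("sha256").update(content).digest("hex");
  const targetDir = nodePath.join(tmpRoot, hash);
  const targetPath = nodePath.join(targetDir, nodePath.basename(libPath));

  // Only write the copy once; the hash in the folder name ties it to the content.
  if (!fs.existsSync(targetPath)) {
    fs.mkdirSync(targetDir, { recursive: true });
    fs.writeFileSync(targetPath, content);
  }
  return targetPath;
}
```

Passing a custom `snapshotPrefix` makes it possible to exercise the function against an ordinary directory, without building an executable first.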

The corresponding pkg config file is as follows:

{
  "name": "myapp",
  "bin": "main.js",
  "private": true,
  "pkg": {
    "assets": [
      "node_modules/ffi-napi/build/Release/ffi_bindings.node",
      "node_modules/ref-napi/prebuilds/linux-x64/node.napi.node",
      "node_modules/vosk/lib/linux-x86_64/libvosk.so"
    ],
    "outputPath": "./",
    "patches": {
      "node_modules/ffi-napi/lib/dynamic_library.js": [
        "this._path = path;",
        "if (path.startsWith('/snapshot/')) { const Fs = require('fs'); const moduleContent = Fs.readFileSync(path); const hash = require('crypto').createHash('sha256').update(moduleContent).digest('hex'); const Path = require('path'); const tmpFolder = Path.join(require('os').tmpdir(), 'pkg', hash); const newPath = Path.join(tmpFolder, Path.basename(path)); if (!Fs.existsSync(tmpFolder)) { Fs.mkdirSync(tmpFolder, { recursive: true }); Fs.copyFileSync(path, newPath); } path = newPath; } this._path = path;"
      ]
    },
    "targets": ["node16-linux-x64"]
  }
}

And by using the reproduction sample provided in the first message of this issue, it can be seen that it works as expected.

The patch could be improved to follow the original pkg code more closely and detect the name of the package after the /node_modules/ part of the path; as it is now, the .so file is just extracted to the sha256 subdirectory, but this is good enough to make it work.
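
One way that improvement might look, as a sketch only: take the segment after the last node_modules in the path as the package name and include it in the extraction directory. `packageNameFromPath` and `extractionDir` are hypothetical helpers; the layout merely imitates the /tmp/pkg/<hash>/<package> tree shown earlier in this issue and is not taken from pkg's source.

```javascript
const path = require("path");

// Return the package name that follows the last "node_modules" segment,
// including the scope for names like "@scope/name"; null if there is none.
function packageNameFromPath(filePath) {
  const segments = filePath.split(/[\\/]/);
  const index = segments.lastIndexOf("node_modules");
  if (index === -1 || index + 1 >= segments.length) {
    return null;
  }
  const first = segments[index + 1];
  if (first.startsWith("@") && index + 2 < segments.length) {
    return first + "/" + segments[index + 2];
  }
  return first;
}

// Build <tmpRoot>/<hash>/<package>, falling back to <tmpRoot>/<hash> when no
// package name can be derived from the path.
function extractionDir(tmpRoot, hash, filePath) {
  const name = packageNameFromPath(filePath);
  return name ? path.join(tmpRoot, hash, name) : path.join(tmpRoot, hash);
}
```

With the libvosk.so path from this issue, `packageNameFromPath` yields "vosk", so the library would land in a vosk subfolder next to where ffi-napi and ref-napi are extracted.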

Thank you so much for the idea!

j1elo commented 2 years ago

I'm not sure what the actionable conclusion of this issue is. While I found a solution, we might argue that it is just a workaround, and that maybe pkg itself could include the patch instead of letting all application writers fight this same problem whenever ffi-napi is used in their codebase. Or maybe this could simply be turned into a documentation addition, to warn future users about this whole issue.

robertsLando commented 2 years ago

Glad it worked! If you want, you can make a PR and add it to the patches folder; I'll be happy to review it.

Also, if you wish, consider buying me a coffee 🙏🏻

https://github.com/sponsors/robertsLando

j1elo commented 1 year ago

The solution outlined in comment https://github.com/vercel/pkg/issues/1744#issuecomment-1235706968 does work; I've resumed work on this area and in the coming weeks (sorry, my timeline advances very erratically) will be checking what else is needed to get PR https://github.com/vercel/pkg/pull/1745 through, so all users can benefit from this in the long term.