heroku / buildpacks-nodejs

Heroku's Cloud Native Buildpacks for Node.js applications.
BSD 3-Clause "New" or "Revised" License
2 stars 2 forks source link

`Error: EACCES: permission denied, mkdir '/layers/heroku_nodejs-npm-install'` due to NPM writing to `/layers` at runtime #800

Closed Crazybun0422 closed 5 months ago

Crazybun0422 commented 8 months ago

image I reckon these Images are indeed bug existing.

Crazybun0422 commented 8 months ago

I tried them all image

joshwlewis commented 8 months ago

@Crazybun0422 - Looks like something is trying to create a directory in /layers/heroku_nodejs-npm-install at runtime. I'm not surprised that you would see EACCES for that. Do you have any insights into what might be doing that? Are you calling npm at runtime in some way?

Crazybun0422 commented 8 months ago

@Crazybun0422 - Looks like something is trying to create a directory in /layers/heroku_nodejs-npm-install at runtime. I'm not surprised that you would see EACCES for that. Do you have any insights into what might be doing that? Are you calling npm at runtime in some way?

I did npm install when initializing the app, however, this error is reported as I did execute 'node ./web/index.js'. that's difficult to understand why, and this error just appeared when I pull this image last time. so I just guess all these images actually released bug existing.

Crazybun0422 commented 8 months ago

image I suddenly found this 'heroku/builder-classic:22' is worked, I didn't try this..

Crazybun0422 commented 8 months ago

heroku/builder-classic:22 Only this version is work. I tested times of them @joshwlewis

but seems like all the APIs are deprecated, how I could modify my app to satisy these? image I utilized these images to deploy my shopify app, whether api collections of shopify are not matched for new images of heroku?

joshwlewis commented 8 months ago

@Crazybun0422 heroku/builder-classic:22 is deprecated, and part of the reason is that it uses an older Buildpack API (0.4, 0.10 is current). I don't think you'll be able to fix those deprecation warnings.

I highly suggest heroku/builder:22 instead. I'd like to understand more about the issue with heroku/bulder:22 - can you try starting the app with DEBUG=* environment variable (that should print out a few more logs)?

edmorley commented 8 months ago

@Crazybun0422 Have you had a chance to try the above? Also, could you provide more information on when this stopped working? The build log from the last successful build would be very helpful.

edmorley commented 7 months ago

(I've transferred this issue from https://github.com/heroku/cnb-builder-images to https://github.com/heroku/buildpacks-nodejs)

edmorley commented 7 months ago

xref: https://community.fly.io/t/re-deploy-my-nodejs-app-and-sadly-got-crashed/18457

edmorley commented 7 months ago

@joshwlewis @colincasey So I can reproduce this easily using the Node.js getting started guide, if I switch to a different user at runtime (that doesn't have write access to /layer). (I actually encountered this issue whilst trying out Heroku-24's split build+run user change against all existing CNBs to see what the fallout would be, and remembered seeing it here.)

It looks like NPM or something is trying to write out a cache to the /layers directory, when it doesn't have permissions to do so. (Only /tmp or $HOME are writable when using a different user at runtime).

For simplicity in this repo I've used the nobody user, but imagine it's instead the run-image-user (which for Heroku-24 would be heroku, vs the build-image user heroku-build):

$ git clone https://github.com/heroku/node-js-getting-started && cd node-js-getting-started
$ pack build node-test
...
$ docker run --rm -it --user nobody --env npm_config_loglevel=silly node-test
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
npm verb cli /layers/heroku_nodejs-engine/dist/bin/node /layers/heroku_nodejs-engine/dist/bin/npm
npm info using npm@10.2.4
npm info using node@v20.11.1
npm verb cache could not create cache: Error: EACCES: permission denied, mkdir '/layers/heroku_nodejs-npm-install'
npm verb logfile could not create logs-dir: Error: EACCES: permission denied, mkdir '/layers/heroku_nodejs-npm-install'
npm verb title npm start
npm verb argv "start"
npm verb logfile logs-max:10 dir:/layers/heroku_nodejs-npm-install/npm_cache/_logs/2024-03-18T10_02_30_443Z-
npm verb logfile could not be created: Error: ENOENT: no such file or directory, open '/layers/heroku_nodejs-npm-install/npm_cache/_logs/2024-03-18T10_02_30_443Z-debug-0.log'
npm verb logfile no logfile created
npm sill logfile done cleaning log files

> node-js-getting-started@0.3.0 start
> node index.js

npm verb stack Error: EACCES: permission denied, mkdir '/layers/heroku_nodejs-npm-install'
npm verb cwd /workspace
npm verb Linux 6.6.16-linuxkit
npm verb node v20.11.1
npm verb npm  v10.2.4
npm ERR! code EACCES
npm ERR! syscall mkdir
npm ERR! path /layers/heroku_nodejs-npm-install
npm ERR! errno -13
npm ERR! Error: EACCES: permission denied, mkdir '/layers/heroku_nodejs-npm-install'
npm ERR!  [Error: EACCES: permission denied, mkdir '/layers/heroku_nodejs-npm-install'] {
npm ERR!   errno: -13,
npm ERR!   code: 'EACCES',
npm ERR!   syscall: 'mkdir',
npm ERR!   path: '/layers/heroku_nodejs-npm-install'
npm ERR! }
npm ERR!
npm ERR! The operation was rejected by your operating system.
npm ERR! It is likely you do not have the permissions to access this file as the current user
npm ERR!
npm ERR! If you believe this might be a permissions issue, please double-check the
npm ERR! permissions of the file and its containing directories, or try running
npm ERR! the command again as root/Administrator.
npm verb exit -13
npm verb unfinished npm timer command:start 1710756150561
npm verb unfinished npm timer command:run-script 1710756150568
npm verb code -13

npm ERR! Log files were not written due to an error writing to the directory: /layers/heroku_nodejs-npm-install/npm_cache/_logs
npm ERR! You can rerun the command with `--loglevel=verbose` to see the logs in your terminal
colincasey commented 7 months ago

Interesting find @edmorley. I'll play around with this reproduction later today but it seems like the culprit here may be npm's logging.

edmorley commented 7 months ago

I don't think the logging is the root issue, the inability to write a cache is. It's just that when the original error occurs, NPM tries to create a log file about it, which also fails, but a previous error has occurred either way.

joshwlewis commented 7 months ago

I don't think the logging is the root issue, the inability to write a cache is.

The cache would have been written during the build phase. It's surprising to me that npm is trying to create one. A rough guess: npm at runtime doesn't know where the cache was at build time.

joshwlewis commented 7 months ago

Looks like the cache layer does get written correctly during the build phase, but is not available at runtime, (by design - it reduces image size to leave it out). npm start at runtime doesn't see the cache, and tries to create it.

A few options I can think of:

  1. Make the cache layer available at launch with launch = true. This would increase the image size, and probably/possibly include stale and or irrelevant cache data.
  2. Run npm config set cache none in the global npm config (this seems to prevent using and/or repopulating the cache), but only at runtime. We'd still want to use the cache at build time.
  3. Set npm config set cache /workspace/some/writable/path in the global npm config so the cache can be repopulated at runtime.
colincasey commented 7 months ago

@joshwlewis @edmorley I was hoping it would be something simple like the cache folder doesn't exist at launch so the logging fails but you're right, the logging failures are not the main cause.

The npm start command is definitely trying to do something with the cache but it's unclear to me what that might be at the moment. There is code in npm that attempts to recursively mkdir the cache directory but it won't fail immediately if it can't.

I also shelled into the built image and tried the following commands with no issue:

So maybe there's something specific about the start command that triggers this? Will keep digging.

colincasey commented 7 months ago

@joshwlewis @edmorley the underlying cause here seems to be the update-notifier causing this issue.

When an npm command is executed, an update check is registered to execute asynchronously. The notifier makes a request out to the npm registry for the npm packument metadata (so it can read the versioning). These operations all involve the configured npm cache which is, for this buildpack, is not available at runtime and, even if it was available, should not be writable. The end result appears to be an uncaught exception that causes the error above.

Following the reproduction steps that @edmorley outlined above with a slight alteration to disable the update notifier works:

git clone https://github.com/heroku/node-js-getting-started && cd node-js-getting-started
pack build node-test
docker run --rm -it --user nobody --env npm_config_loglevel=silly --env npm_config_update-notifier=false node-test

I'm going to file an issue with npm, but in the meantime, we should consider:

colincasey commented 7 months ago

Link to existing issue with npm - https://github.com/npm/cli/issues/7044. This should be fixed in 10.2.5 by https://github.com/npm/cli/pull/7061.

edmorley commented 6 months ago

It looks like the default user change for Fly.io (that was what triggered this issue being opened) was unintentional, and was fixed in: https://github.com/superfly/flyctl/pull/3396

In addition, the NPM fix for the update-notifier trying to initialise the cache has been fixed in NPM.

As such, building the Node.js getting started guide with a read-only filesystem (via the docker run --user nobody trick/hack) now succeeds: https://github.com/heroku/cnb-builder-images/actions/runs/8877305233/job/24370680274

Longer term we should probably test that all of our buildpacks work with read-only mode (when combined with an app that does the right thing; we can't do anything about app code/app dependencies that try to write to places they shouldn't). xref: https://github.com/heroku/base-images/issues/268

For Node.js specifically, is there any other follow-up work here? For example, it sounded like NPM tries to write logs to a read-only directory in the case of an error occurring. Presumably errors can occur for reasons other than the update-notifier not being able to write to the cache (for example if npm run start fails to run due to an app code issue), so it seems might be worth having the buildpack configure NPM to write the logs directory to /tmp regardless?