microsoft / vscode-docker

Docker Extension for Visual Studio Code
https://marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-docker

Why move node_modules to parent directory? #2071

Open nunovieira opened 4 years ago

nunovieira commented 4 years ago

https://github.com/microsoft/vscode-docker/blob/018c92815ff2eeed50ec9e9af1d842ebf7a7a329/resources/templates/node/Dockerfile.template#L5

This line intrigues me. Why the mv node_modules ../ part? Why move node_modules up a directory? Should this be in a generic Node.js Dockerfile template?
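For context, the surrounding template looks roughly like this (a sketch: only the RUN line is quoted verbatim elsewhere in this thread; the base image, paths, and port are typical assumptions, not the exact file):

    FROM node:lts-alpine
    ENV NODE_ENV=production
    WORKDIR /usr/src/app
    # Copy manifests first so the dependency layer is cached between builds
    COPY ["package.json", "package-lock.json*", "./"]
    RUN npm install --production --silent && mv node_modules ../
    # Then copy the application source
    COPY . .
    EXPOSE 3000
    CMD ["npm", "start"]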

bwateratmsft commented 4 years ago

@philliphoff do you know why we do this?

philliphoff commented 4 years ago

It's from before my time on the extension, but my guess is that it was (at least an attempt at) a Docker image build layer optimization. It may also have been done to allow the flexibility to bind-mount the local application source into the container (rather than always building within the container); by keeping the node_modules folder in the parent directory (in the container as well as locally), you no longer have to worry about platform-specific libraries on the local machine not matching the platform of the container (as described here).
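A minimal sketch of that bind-mount scenario (hypothetical compose file; the paths assume the image ran npm install in /usr/src/app and moved node_modules up to /usr/src). Node resolves require() by checking ./node_modules, then ../node_modules, and so on up to the filesystem root, so the dependencies installed in the image are still found even though the mount replaces the app directory's contents:

    # docker-compose.yml (sketch)
    services:
      app:
        build: .
        volumes:
          # Host source (with no node_modules) covers /usr/src/app;
          # /usr/src/node_modules from the image remains visible to Node.
          - ./:/usr/src/app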

A downside is that there are some (fairly popular even) Node libraries which do not work properly in that arrangement.

bwateratmsft commented 4 years ago

I looked through the Git history; it's been this way for a very long time.

@lostintangent was more or less the first to make it this way (up one level), albeit while changing to Yarn instead of NPM: https://github.com/microsoft/vscode-docker/pull/85 He then immediately changed it back to NPM: https://github.com/microsoft/vscode-docker/pull/86

Since then the line has been moved around a few times, but moving node_modules up a level has persisted.

@lostintangent Do you remember the history behind this?

lostintangent commented 4 years ago

As @philliphoff mentioned, this was done to allow users to mount a local copy of their codebase into the container, without the local copy needing to include the restored node_modules (which might also be built for an entirely different operating system).

That said, it’s been quite a while, so I’m not sure if this strategy is still appropriate 😁

bwateratmsft commented 4 years ago

@nunovieira Does this arrangement affect one of your projects?

nunovieira commented 4 years ago

This doesn't affect me because I don't use this anymore. I start the "dockerization" by getting the Dockerfile from the Node.js guide; it's simpler and better explained. I never move node_modules. For development, I bind-mount the application directory and use an anonymous volume for node_modules. In a compose file, there's something like:

    volumes:
      - ./app:/usr/src/app:delegated
      - /usr/src/app/node_modules

To have IntelliSense in VS Code, I copy the node_modules from the container to the host. That code will never be executed on the host; it's just to get the definitions and introspection stuff.
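That copy step can be a one-liner (a sketch; myapp is a hypothetical container name, and the paths follow the compose snippet above):

    # Copy the container's installed node_modules back to the host, purely so
    # the editor can see type definitions; the host never executes this code.
    docker cp myapp:/usr/src/app/node_modules ./app/node_modules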

IMO, this template is there to kick-start initial development. It could have some comments to help first-timers get it to a production-ready state.

bwateratmsft commented 4 years ago

Yes, we can look at ways to improve the default Node Dockerfile with additional comments. @uchenkadicode has done quite a bit of that for the Python Dockerfile.

TheBrenny commented 3 years ago

So according to this blog post, it looks like it's so you can develop for two different systems at once (such as developing on a Windows host with a Linux container). But if you .dockerignore the node_modules/ folder, then moving it up one level won't matter anyway? Even if a .dockerignore isn't present, according to the Docker reference, COPY only "copies new files or directories". The word "new" stands out to me and makes it sound like if the file to be copied already exists in the container, it won't be copied, meaning the entire node_modules directory would be ignored anyway?

I'm no docker expert, though. I was just curious as to why && mv node_modules ../ is present, and my hunting landed me here...
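For reference, the exclusion mentioned above is typically a couple of lines in a .dockerignore file (a generic sketch, not taken from this repo's template):

    # .dockerignore: keep the host's node_modules out of the image build context
    node_modules
    npm-debug.log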

grokpot commented 2 years ago

The problem process is:

  1. It's common to bind-mount your host workspace to your container workspace to allow hot-reloading (e.g., - ./my-app:/app)
  2. The image is built and node modules are installed
  3. The bind mount from Step 1 is attached, but some or all dependencies are missing if the host node_modules differs from the container's
  4. A common workaround is to use a data volume to exclude node_modules from Step 1 (e.g., - /app/node_modules; see the compose sketch below). Although, a blank node_modules will still exist in the container.
  5. Data volumes (Step 4) are created once, meaning installing dependencies on the host (maybe for the benefit of auto-writing to package.json) will not transfer to the container
  6. One solution for Step 5 (call it Solution 3) is the template's approach: move node_modules to the parent directory, relying on Node resolving modules from parent directories
  7. Step 6's Solution 3 means CLI tools installed via node_modules will not work in package.json scripts, e.g. nodemon server.js
  8. The solution to Step 7 is to use npx, which (I believe) uses the same node_modules resolution pattern as Step 6's Solution 3

These steps result in a host-container environment with hot-reloading and detection/installation of dependency changes. Yay!
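A compose sketch of Steps 1 and 4 together (hypothetical file and paths, matching the examples above):

    # docker-compose.yml (sketch): the anonymous volume in Step 4 masks the
    # host's node_modules so the container keeps its own installed copy
    services:
      app:
        build: .
        volumes:
          - ./my-app:/app       # Step 1: bind mount for hot-reloading
          - /app/node_modules   # Step 4: anonymous volume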

The irony of this is - when starting my new project:

  1. I immediately commented out the line RUN npm install --production --silent && mv node_modules ../ because I didn't understand "Why move node_modules to parent directory?"
  2. I spent hours understanding the problem
  3. In the end, I uncommented the line

I think y'all should leave it. It helps many people without them knowing the benefit.

The only thing to change would be maybe mv node_modules ../ to mv node_modules /, so that an absolute path is used instead of a relative one; the relative path can trip up package.json execution in services like Google Cloud Run.

zq0904 commented 2 years ago

> The problem process is: […]
>
> I think y'all should leave it. It helps many people without them knowing the benefit.
>
> The only thing to change would be maybe mv node_modules ../ to mv node_modules / […]

First of all, thank you for the detailed write-up!

But I'm sorry, I still don't understand why. Can you give me a simple example?