CaravanStudios / open-product-recovery

Apache License 2.0

Improve example server container dependency loading #44

Closed rvenables closed 1 year ago

rvenables commented 1 year ago

Currently, the example server Docker image builds using the latest published packages from the public package registry [^1].

Under most conditions, this approach works perfectly fine. However, it inconveniences contributors attempting to use the Docker image against modified local source: local (not yet published in the package registry) OPR package changes are not reflected in the Docker image, even though the latest source for the server is included.

In addition, local package source changes that are breaking (from the perspective of the example server) will cause the docker test script to fail both locally and in GitHub.
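For illustration, the failure mode comes from an install pattern roughly like the following (a hypothetical sketch, not the literal contents of Dockerfile.dev):

```dockerfile
# Hypothetical sketch of the dependency-loading pattern: resolving opr-*
# packages via npm install pulls whatever versions were last *published*,
# so local workspace edits never reach the image, even though the server's
# own source is copied in afterwards.
FROM node:16
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install        # opr-* dependencies come from the public registry here
COPY . .
CMD ["npm", "start"]
```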

[^1]: See lines 3-6 on https://github.com/google/open-product-recovery/blob/35cf411f8b939e71b7ec00cca37488bf96445c45/opr-example-server/Dockerfile.dev.

rvenables commented 1 year ago

I’ll look at this one and suggest a solution in the next day or two.

rvenables commented 1 year ago

**Tentative Suggested Candidate Solution: Copy Relevant OPR Packages to Image Prior to NPM Install**

Extract the relevant OPR modules, identified by the naming convention (`opr-`), from the server's package.json; copy them from the existing symlinks in node_modules into a `node_modules.docker` directory (so the step does not depend on each package's exact location on disk); then update the Docker build to pre-fill those modules into the container's node_modules directory[^1].

As long as the module naming convention holds, this seemingly requires little maintenance, and only the OPR modules explicitly declared as dependencies in package.json are included (which helps keep image-build performance reasonable).
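A minimal sketch of that staging step (the `grep`-based package selection and the tiny fixture are illustrative assumptions; a real script would more likely read package.json via node or jq):

```shell
set -eu

# --- illustrative fixture standing in for the example-server workspace,
# --- so this sketch is self-contained ---
mkdir -p demo/node_modules/opr-core demo/node_modules/express
cat > demo/package.json <<'EOF'
{ "dependencies": { "opr-core": "*", "express": "*" } }
EOF

# --- the idea itself: stage only opr-* dependencies into node_modules.docker,
# --- dereferencing symlinks (-L) so the copy does not depend on where the
# --- lerna-linked package source actually lives ---
mkdir -p demo/node_modules.docker
for pkg in $(grep -o '"opr-[^"]*"' demo/package.json | tr -d '"'); do
  cp -RL "demo/node_modules/$pkg" "demo/node_modules.docker/$pkg"
done

ls demo/node_modules.docker   # prints: opr-core
```

The Docker build would then copy `node_modules.docker` into the container's node_modules directory (destination path assumed) before `npm install`, so npm leaves those packages in place.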

I tested this out briefly and it seems to work: it is quick to build and, at least on the surface, low-maintenance. However, I'm curious to hear other solutions. I was surprised that a bit of searching didn't turn up a clearly superior alternative, though that doesn't mean one doesn't exist.

Here are a few more solutions I briefly considered but don’t recommend:

  1. Copy All Host Node Modules to Image - One option is to copy the host's complete node_modules tree into the image. A complicating factor is that installed modules are potentially dependent on the host architecture[^2]. That means it might be reliable initially, but - as project dependencies evolve - the container build may become flaky on host machines that differ substantially from the image's OS/architecture. A potential workaround might be to use `npm rebuild`[^3] to recompile any relevant package binaries, but I haven't yet explored that approach.

  2. Host a Private Package Registry/Proxy - As a pre-build step for the example server, spin up a fresh instance of Verdaccio[^4] or a similar private registry on the host. Then, add in the local OPR packages as special overrides[^5]. Finally, alter the image package config to use the new local registry, use the host network option to let the build step reach the private registry on the host[^6], and clean up the private registry afterwards. This looks potentially workable locally, but I haven't tried it, in part because the added complexity seems too high. Another concern is whether this approach can run in GitHub Actions (I haven't vetted that yet, but at a quick glance it looks like it might not be possible).

  3. Copy & Build Entire Project Source in the Container - On the surface, this looks like the obvious best solution: copy in all of the project files from the parent directory, using a subtractive approach to exclude unwanted directories, and run the normal lerna build during the Docker image build. However, this falls down when you consider that Docker build file paths can't escape the build context[^4] of the current directory without a workaround. Also, by being subtractive, this approach runs the minor risk of inadvertently copying local working files that a developer might have placed in the parent OPR directory. An additive variant is of course possible, but it seemingly carries the maintenance cost of constantly keeping the example server's list of file paths to ingest up to date.

  4. Mount the Project Source in the Container - Another option is to mount the whole local project source into the Docker container at runtime. This ends up being tricky because the mount includes the node_modules directory, which - as noted above - ideally should not be copied over. There are a few workarounds to explore, such as moving node_modules one level above the workspace directory, but that seemed like a hassle to solve just for this case.
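As a side note on option 3, the usual workaround for the build-context limitation is to run the build from the monorepo root and select the Dockerfile with `-f`. A sketch (paths assumed from the repo layout; the command is printed rather than executed, since it needs a Docker daemon and a full checkout):

```shell
# Invoking docker from the repo root puts the whole project inside the build
# context, and -f picks out the example server's Dockerfile. A .dockerignore
# entry for node_modules would be needed to keep the context small. Paths in
# the Dockerfile must then be written relative to the repo root.
build_cmd='docker build -f opr-example-server/Dockerfile.dev -t opr-example-server:dev .'
echo "run from the open-product-recovery checkout root:"
echo "  $build_cmd"
```

This avoids the context escape entirely, but inherits the subtractive-exclusion risk discussed above.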

I'll leave this comment here for a while to solicit feedback. If there are no objections or other suggested solutions, I'll open a PR with a cleaned-up prototype of the top solution above.

[^1]: A key assumption is that npm builds the physical representation of the dependency tree by checking the packages already present in the node_modules folder, and will not overwrite existing packages. For this to remain a good solution, that must also hold in the future. While I didn't have time to fully track this down in the related npm CLI source (e.g. https://github.com/npm/cli/tree/latest/workspaces/arborist), I was able to partially test the behavior, and it seems reasonable to expect it to continue, given the performance benefit of not overwriting packages and the difficulty of changing core behavior in established platforms.

[^2]: For example, see https://github.com/mapbox/node-pre-gyp