Open dpolivy opened 4 years ago
I would expect
rush deploy
to be able to produce output that could be deployed without relying on the symlinks.
This is technically impossible without changing to a fundamentally different installation strategy, and reintroducing the phantom dependency and doppelganger problems that are solved by symlinks. The right way to approach this problem is to figure out how to get Azure App Service to create the symlinks.
The rush deploy
command already supports a "linkCreation": "script" setting that can defer the symlink creation until after the archive is unpacked. For example, a simple workaround might be for the application itself to invoke the create-links.js script automatically during initial startup.
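A minimal sketch of that startup workaround, assuming the archive has been unpacked to a writable folder. The marker file name `.links-created` and the `ensureLinksCreated` helper are hypothetical, and the `create` argument passed to create-links.js is an assumption about the generated script's usage that should be checked against your Rush version:

```javascript
'use strict';
const fs = require('fs');
const path = require('path');
const { spawnSync } = require('child_process');

/**
 * Runs create-links.js once per unpacked deployment.
 * `markerName` is a hypothetical marker file, not something Rush creates.
 * `run` is injectable so the logic can be exercised without a real deploy folder.
 */
function ensureLinksCreated(deployDir, run = defaultRun, markerName = '.links-created') {
  const markerPath = path.join(deployDir, markerName);
  if (fs.existsSync(markerPath)) {
    return false; // links were already created on a previous startup
  }
  run(path.join(deployDir, 'create-links.js'));
  fs.writeFileSync(markerPath, new Date().toISOString());
  return true;
}

function defaultRun(scriptPath) {
  // Assumption: the generated script accepts a "create" action argument.
  const result = spawnSync(process.execPath, [scriptPath, 'create'], { stdio: 'inherit' });
  if (result.status !== 0) {
    throw new Error('create-links.js failed with exit code ' + result.status);
  }
}

module.exports = { ensureLinksCreated };
```

The application entry point would call `ensureLinksCreated(__dirname)` before its first `require()` of a linked package. This only helps on hosts where the deployed folder is writable and symlinks can be created at all.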
But ideally we should figure out a best practice for Azure App Service and document it. We could also consider proposing an improvement for the Azure App Service feature to better support symlinks. (For example, the .zip files created by Rush's --create-archive
parameter are already capable of storing symlinks, although many unzipping tools do not support that.)
@dpolivy Are you able to do some research into these options?
The challenge here is that the Run from Package deployment doesn't actually unpack the files. It mounts the ZIP directly to the filesystem and runs it from there. There are a few benefits of doing it this way which you can read about in the documentation. So there's no way to fix up the links after the fact with this approach. It might be possible if I switch back to the traditional deployment model, but that is not an ideal solution.
(For example, the .zip files created by Rush's
--create-archive
parameter are already capable of storing symlinks, although many unzipping tools do not support that.)
This is interesting, I did not realize that. I am currently using Azure DevOps ArchiveFile task, which uses 7zip v16, which does not preserve links. I'll take a look at this and see if it solves the problem, though. The real question is whether whatever tool Azure is using to mount the archive supports them as well. If I need additional directories included into the zip, is there any easy way to do so (can the deploy
and --create-archive
steps be run separately so I can modify the deploy directory contents first)?
This is technically impossible without changing to a fundamentally different installation strategy, and reintroducing the phantom dependency and doppelganger problems that are solved by symlinks.
While I understand the philosophy here, when packaging for a production deployment, is it as important to keep these boundaries intact? If we solve the problems in the development scenario, would it be ok to build an optimized production "build artifact" that works without symlinks even if it potentially allowed for phantom dependencies?
I'm definitely interested in solving this problem so I can move on to other work, so happy to try and dig in where I can. As I'm new to rush and pnpm, etc, I'm not as familiar with the internals and nuances, so appreciate any guidance and assistance you can provide. Also, are there specific Microsoft folks associated with rush who might be able to assist and/or engage the internal App Service teams for discussion?
(For example, the .zip files created by Rush's
--create-archive
parameter are already capable of storing symlinks, although many unzipping tools do not support that.)
I was able to give this a try, and unfortunately, it seems that the links are not handled properly when the ZIP is mounted into App Service. The links just show up as files with the content being the target of the link. So seems like this approach isn't going to work ☹
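One way to confirm this kind of failure is to compare the local deploy folder against the mounted or extracted copy. The sketch below is an illustration only: `findLostSymlinks` is a hypothetical helper, and it assumes both folders are reachable from the same Node.js process.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Returns true only if the entry itself (not its target) is a symlink.
function isSymlink(entryPath) {
  try {
    return fs.lstatSync(entryPath).isSymbolicLink();
  } catch (error) {
    return false; // missing entries count as "not a symlink"
  }
}

/**
 * Recursively lists symlinks under `sourceDir` whose counterpart under
 * `targetDir` is missing or is not a symlink.
 * Returned paths are relative to `sourceDir`.
 */
function findLostSymlinks(sourceDir, targetDir, relative = '') {
  const lost = [];
  const entries = fs.readdirSync(path.join(sourceDir, relative), { withFileTypes: true });
  for (const entry of entries) {
    const rel = path.join(relative, entry.name);
    if (entry.isSymbolicLink()) {
      if (!isSymlink(path.join(targetDir, rel))) {
        lost.push(rel);
      }
    } else if (entry.isDirectory()) {
      lost.push(...findLostSymlinks(sourceDir, targetDir, rel));
    }
  }
  return lost;
}
```

A non-empty result means the archive tool or the host turned links into plain files or dropped them.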
I see. You could probably work around this for many cases by using webpack to bundle your app, so that there are no node_modules
dependencies.
We could probably also come up with a way to use a different installation plan (e.g. run pnpm install --shamefully-flatten
in the deployment folder), but that would be deploying something different from what you tested during development.
Fundamentally however, these limitations of the "Run from Package" feature seem maybe too restrictive for a professional deployment strategy that we could generally support. Do they have any other options you could use instead?
I haven't used webpack, so I'm not familiar with how it works on server-side code, especially when using modules with native node components (sharp, edge.js).
I wasn't aware of shamefully-hoist
(the new name for shamefully-flatten
) as it's not linked in the pnpm install
doc page. It might help, but not if it's only putting the dependencies in common/temp/node_modules
instead of an actual root node_modules
or the app node_modules
directories.
Would using regular npm
be a better option, since it doesn't necessarily need to symlink the same way pnpm
does? I guess I don't really understand why building a production release that is the equivalent of an npm ci
flat node_modules install is such a bad thing? I get that there are issues in "correctness", but if those are maintained throughout development, isn't it safe to structure things a little differently in production (that seems to be the rationale behind shamefully-hoist
) if the deployment model doesn't support symlinks? I don't use containers, but I believe there are some other scenarios where symlinks are not desired.
Fundamentally however, these limitations of the "Run from Package" feature seem maybe too restrictive for a professional deployment strategy that we could generally support. Do they have any other options you could use instead?
Yes, there is also Deploy from ZIP. This would potentially offer the ability to fix the symlinks after the files are unzipped in the source. We used to use this deployment method, but found it would take 30-40 minutes to complete given the number of files we had in node_modules. When we switched to Run from Package, the deployment was complete within 2 minutes. So a pretty significant time savings for us, in addition to the other benefits of running from package (see docs):
- Eliminates file lock conflicts between deployment and runtime.
- Ensures only full-deployed apps are running at any time.
- Can be deployed to a production app (with restart).
- Improves the performance of Azure Resource Manager deployments.
- May reduce cold-start times, particularly for JavaScript functions with large npm package trees.
I'm willing to revisit this for the sake of experimentation, but my desired goal would be to run from package.
There is also one other similar scenario, whereby an app would need to have a fully self-contained node_modules
directory under it -- Azure WebJobs. When run, the app directory itself is copied to a random directory in a temporary (local) filesystem on the machine running the job, and therefore the app directory must itself contain all dependent modules (and their dependencies). If this is a separate task, I don't want to conflate that here, but it's another problem I need to address.
I haven't used webpack, so I'm not familiar with how it works on server-side code, especially when using modules with native node components (sharp, edge.js).
Some Node.js tools are packed into a single bundle. For example, yarn
comes as a single .js file. But it will run into trouble with native dependencies and any libraries that probe around in the node_modules folder without using require()
. So Webpack is an approach that only works in certain well-behaved cases.
Would using regular
npm
be a better option, since it doesn't necessarily need to symlink the same way pnpm
does?
pnpm --shamefully-hoist
is functionally equivalent to NPM. If we build this into Rush, I'd approach it using PNPM so we can use pnpmfile.js and other PNPM-specific features that will make this easier.
I get that there are issues in "correctness", but if those are maintained throughout development, isn't it safe to structure things a little differently in production (that seems to be the rationale behind
shamefully-hoist
) if the deployment model doesn't support symlinks?
Read about doppelgangers for example. When these problems arise, they don't have easy solutions. For small installation scenarios (which may very well include deployments), if those problems don't arise, then everything just works fine and nobody understands what we're talking about regarding "correctness". :-) Whereas if you eventually do encounter those problems, they can be thorny.
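For readers unfamiliar with the term: a doppelganger is a second physical copy of the same version of the same package inside one node_modules tree. A small sketch of how one might scan for them, assuming you already have an inventory of installed copies (the `installed` array shape and the `findDoppelgangers` name are hypothetical):

```javascript
'use strict';

/**
 * Groups installed package copies by "name@version" and returns only the
 * groups that occur at more than one path.
 * `installed` is a hypothetical inventory: [{ name, version, path }, ...].
 */
function findDoppelgangers(installed) {
  const pathsById = new Map();
  for (const pkg of installed) {
    const id = pkg.name + '@' + pkg.version;
    if (!pathsById.has(id)) {
      pathsById.set(id, []);
    }
    pathsById.get(id).push(pkg.path);
  }
  const result = {};
  for (const [id, paths] of pathsById) {
    if (paths.length > 1) {
      result[id] = paths;
    }
  }
  return result;
}
```

Each duplicate gets its own module instance at runtime, which is why singletons and `instanceof` checks can misbehave when doppelgangers are present.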
We used to use this deployment method, but found it would take 30-40 minutes to complete given the number of files we had in node_modules. When we switched to Run from Package, the deployment was complete within 2 minutes. So a pretty significant time savings for us
The 30-40 mins time makes sense if you zipped up the entire monorepo installation footprint. However rush deploy
is supposed to carve out a relatively small subset of files needed by the deployed app (excluding devDependencies
in particular). I'm curious to hear if this timing is much smaller for a zip file created by rush deploy
.
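To illustrate why the subset can be much smaller, here is a rough sketch of a production-only dependency closure. It is not Rush's actual algorithm: `collectProductionClosure` and the `manifests` map are hypothetical, and the sketch ignores versions, optionalDependencies, and peerDependencies.

```javascript
'use strict';

/**
 * Returns the names of all packages reachable from `rootName` by following
 * `dependencies` only. devDependencies are never followed.
 * `manifests` is a hypothetical map from package name to parsed package.json.
 */
function collectProductionClosure(rootName, manifests) {
  const seen = new Set();
  const pending = [rootName];
  while (pending.length > 0) {
    const name = pending.pop();
    if (seen.has(name)) {
      continue; // already visited, which also guards against cycles
    }
    seen.add(name);
    const manifest = manifests[name] || {};
    for (const dependencyName of Object.keys(manifest.dependencies || {})) {
      pending.push(dependencyName);
    }
  }
  return [...seen].sort();
}
```

Anything that is only reachable through a devDependencies edge (build tools, test runners, type definitions) never enters the result.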
Based on our chat, I think you've provided reasonable technical motivation for us to consider a rush deploy
mode that works like npm ci
.
The resulting common/deploy
output should be equivalent to something like this:
common/deploy
pnpm install --shamefully-hoist
to install all the dependencies (from the temporary NPM registry), using the older node_modules
model that avoids symlinks
We wouldn't actually implement it using a temporary NPM registry -- the above is just a behavioral spec.
But our solution could be close to that... 🤔 For example, maybe we could create a temporary pnpmfile.js
that redirects PNPM to look in local folders instead of the NPM registry, for local Rush projects.
As mentioned before, this mode has some downsides:
- The legacy NPM installation model has technical flaws that are sometimes quite painful to deal with (e.g. doppelgangers). This is the main motivation for PNPM's innovation of using symlinks.
- If you deploy using a different installation model from your original monorepo, it means you may be deploying something different from what you tested. Versions might be incorrect, or peer dependencies might not be satisfied the same way, etc. Troubleshooting that might be tricky.
Thus we would NOT make this the default or recommended mode. It would be an optional mode for scenarios like yours.
How's that sound? The next step would be to fiddle around with pnpmfile.js
manually, and see if we can produce a common/deploy folder that way. If it works, then we could look at making this a Rush feature.
I have a little good news/bad news to report.
The 30-40 mins time makes sense if you zipped up the entire monorepo installation footprint. However
rush deploy
is supposed to carve out a relatively small subset of files needed by the deployed app (excluding devDependencies
in particular). I'm curious to hear if this timing is much smaller for a zip file created by rush deploy
.
Good news: with the modified rush deploy
output (including my legacy non-Node app files), the ZIP Deploy method seems to take about 10-12 minutes, at least on the very small sample size of 2 runs I've done so far. So that's a nice improvement. However, ultimately, I am unable to create the symlinks after the files have been copied. Upon some further research, it seems symlinks are not supported in Azure App Service, at all.
Sadly, it seems this is just not at all possible with Azure Web Apps. Some of the answers reference the ability to follow an existing symlink, but it's unclear to me if that is just a system-generated one or if there is any path to creating one by a site owner. I've tried using the archive generated by rush deploy --create-archive
, but it just results in files with targets as the text instead of actual links. I will look into opening a support ticket to see if I can glean more information.
I'll also start fiddling around as you suggest above to see if I can hack something together as a PoC.
I'll also start fiddling around as you suggest above to see if I can hack something together as a PoC.
I would suggest setting "useWorkspaces": true
in your rush.json file, since that will become the default installation model in the next major release of Rush. And the pnpmfile.js fixups will be somewhat different (and actually easier) in that model.
@dpolivy I experimented with this idea a bit myself. I was able to get PNPM to remap the workspace:
specifier using a pnpmfile.js like this (testing with the rush-example monorepo with "useWorkspaces": true
):
'use strict';

module.exports = {
  hooks: {
    readPackage
  }
};

/**
 * This hook is invoked during installation before a package's dependencies
 * are selected.
 * The `packageJson` parameter is the deserialized package.json
 * contents for the package that is about to be installed.
 * The `context` parameter provides a log() function.
 * The return value is the updated object.
 */
function readPackage(packageJson, context) {
  console.log('TRACE: ' + packageJson.name);

  function fixup(dependencyTable) {
    if (!dependencyTable) {
      return;
    }
    for (const dependencyName of Object.keys(dependencyTable)) {
      const versionSpecifier = dependencyTable[dependencyName];
      if (/^workspace:/.test(versionSpecifier)) {
        debugger;
        let newSpecifier = '';
        switch (dependencyName) {
          case 'my-controls':
            newSpecifier = 'file:../../libraries/my-controls/';
            break;
          case 'my-toolchain':
            newSpecifier = 'file:../../tools/my-toolchain/';
            break;
          default:
            throw new Error('Unknown workspace reference to "' + dependencyName + '" for "'
              + packageJson.name + '"');
        }
        dependencyTable[dependencyName] = newSpecifier;
      }
    }
  }

  fixup(packageJson.dependencies);
  fixup(packageJson.devDependencies);
  fixup(packageJson.optionalDependencies);
  fixup(packageJson.peerDependencies);
  return packageJson;
}
And I used this command line for installing:
pnpm install --prod --shamefully-hoist --package-import-method=copy --no-lockfile --prefer-offline
However, even with --package-import-method=copy
, PNPM still seems to create symlinks in the node_modules folder.
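The hard-coded switch in the hook above could also be driven by a lookup table, which is closer to what a Rush feature would need. This is only a sketch: `remapWorkspaceSpecifiers` and `projectFolders` are hypothetical names, and the table reuses the two rush-example projects from the hook above rather than reading rush.json.

```javascript
'use strict';

/**
 * Rewrites "workspace:" specifiers in one dependency table to "file:" paths.
 * `projectFolders` is a hypothetical map from project name to a folder path
 * relative to the package being installed.
 */
function remapWorkspaceSpecifiers(dependencyTable, projectFolders, ownerName) {
  if (!dependencyTable) {
    return;
  }
  for (const [name, specifier] of Object.entries(dependencyTable)) {
    if (!specifier.startsWith('workspace:')) {
      continue; // leave ordinary registry specifiers untouched
    }
    const folder = projectFolders[name];
    if (folder === undefined) {
      throw new Error('Unknown workspace reference to "' + name + '" for "' + ownerName + '"');
    }
    dependencyTable[name] = 'file:' + folder;
  }
}

// The same two projects that the original switch statement handles.
const projectFolders = {
  'my-controls': '../../libraries/my-controls/',
  'my-toolchain': '../../tools/my-toolchain/'
};

function readPackage(packageJson, context) {
  const fields = ['dependencies', 'devDependencies', 'optionalDependencies', 'peerDependencies'];
  for (const field of fields) {
    remapWorkspaceSpecifiers(packageJson[field], projectFolders, packageJson.name);
  }
  return packageJson;
}

module.exports = { hooks: { readPackage } };
```

Unknown workspace references still throw, as in the original, so a missing table entry fails the install instead of silently falling back to the registry.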
@zkochan is there any way to make PNPM install without symlinks, i.e. the installation model used by Yarn classic and NPM? If not, we might need to use Yarn for this.
@zkochan is there any way to make PNPM install without symlinks, i.e. the installation model used by Yarn classic and NPM? If not, we might need to use Yarn for this.
no, the whole point of pnpm is its unique node_modules structure that is made possible by symlinks. So we only support symlinks. We will never support a flat node_modules without symlinks. We might support Yarn's Plug'n'Play, which doesn't require symlinks because it overrides Node's resolution algorithm.
Would be nice if Node supported something like "fake symlinks". pnpm would create just some text files instead of a symlink and Node would use them to resolve the real locations of packages. Maybe we can create an issue at NodeJS.
@zkochan I am trying to understand two things here:
- Why does --shamefully-hoist require symlinks at all -- isn't it essentially reproducing NPM's algorithm?
- What is the purpose of --package-import-method=copy? The docs make it sound like it avoids creating symlinks.
Thanks!
why does --shamefully-hoist require symlinks at all -- isn't it essentially reproducing NPM's algorithm?
it is not reproducing npm's algorithm. It is reproducing npm's flat node_modules, using symlinks.
what is the purpose of --package-import-method=copy? The docs make it sound like it avoids creating symlinks
it has nothing to do with symlinks. It uses copying instead of hard linking.
After giving it more thought. We don't even need changes in NodeJS. We may try overriding the implementation of fs.readlink to make it understand the "fake symlinks".
Would be nice if Node supported something like "fake symlinks". pnpm would create just some text files instead of a symlink and Node would use them to resolve the real locations of packages. Maybe we can create an issue at NodeJS.
@zkochan This is a fascinating idea. However it seems to require hooking every core API that interacts with file paths, not just require()
. For example, any of fs.copyFile()
, child_process.exec()
, etc might be invoked to open a path that passes through a virtual symlink. These APIs do not call fs.readlink()
internally, but instead wrap core OS APIs that internally traverse the filesystem. Also if Node.js spawns child processes, then the monkey patch would somehow need to be enabled for them as well.
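To make the scope of that hooking concrete, here is a sketch of the path translation a "fake symlink" scheme would need. The convention used here (a regular file whose trimmed content is the link target) is hypothetical; neither PNPM nor Node.js defines it, and `resolveFakeSymlinks` is an illustrative name.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

function isRegularFile(entryPath) {
  try {
    return fs.statSync(entryPath).isFile();
  } catch (error) {
    return false; // nonexistent paths are passed through unchanged
  }
}

/**
 * Resolves `inputPath` segment by segment. When an intermediate segment is a
 * regular file, it is treated as a fake symlink: its trimmed content is read
 * as the link target, relative to the directory that contains the file.
 * Limitations of this sketch: a fake link in the final segment cannot be
 * told apart from an ordinary file, and targets that themselves pass
 * through fake links are not followed.
 */
function resolveFakeSymlinks(inputPath) {
  const absolute = path.resolve(inputPath);
  const root = path.parse(absolute).root;
  const segments = absolute.slice(root.length).split(path.sep).filter(Boolean);
  let current = root;
  for (let i = 0; i < segments.length; i++) {
    const candidate = path.join(current, segments[i]);
    const isLast = i === segments.length - 1;
    if (!isLast && isRegularFile(candidate)) {
      const target = fs.readFileSync(candidate, 'utf8').trim();
      current = path.resolve(current, target);
    } else {
      current = candidate;
    }
  }
  return current;
}
```

Every path-taking call (`fs.copyFile()`, `child_process.exec()`, native addons, child processes) would have to route its arguments through a function like this, which is the practical obstacle described above.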
@octogonz It seems there is a new @pnpm/make-dedicated-lockfile
tool that can generate a lockfile for a specific subset of a workspace. That, combined with pnpm PnP should allow rush deploy
to generate a deployable structure without symlinks for folks like me who need that. Do you think that rush deploy
could be updated to support this approach?
See https://github.com/pnpm/pnpm/issues/2198#issuecomment-710882357 for more details.
If PNPM provides the technology to solve this, certainly we would incorporate that into rush deploy
.
I tried @pnpm/make-dedicated-lockfile
but wasn't able to get it working. The CLI is a thin wrapper around this API:
make-dedicated-lockfile/src/index.ts
export default async function (lockfileDir: string, projectDir: string) {
const lockfile = await readWantedLockfile(lockfileDir, { ignoreIncompatible: false })
if (!lockfile) {
throw new Error('no lockfile found')
I tried calling it in the Rush Stack repo with lockFileDir="<repo>/common/temp"
and projectDir="<repo>/apps/api-extractor"
and it deleted the node_modules
folder and then printed this error:
> (node:8472) UnhandledPromiseRejectionWarning: Error: Cannot resolve workspace protocol of dependency "@microsoft/api-extractor-model" because this dependency is not installed. Try running "pnpm install".
at makePublishDependency (C:\Users\Owner\AppData\Roaming\nvm\v12.18.4\node_modules\@pnpm\make-dedicated-lockfile\node_modules\@pnpm\exportable-manifest\lib\index.js:64:19)
at async C:\Users\Owner\AppData\Roaming\nvm\v12.18.4\node_modules\@pnpm\make-dedicated-lockfile\node_modules\@pnpm\exportable-manifest\lib\index.js:53:9
at async Promise.all (index 0)
There are no docs in any of this code, but it seems that maybe:
node_modules
folder that it operates on, which would be bad
BTW @dpolivy you might also want to first verify that (1) your project actually works with Plug'n'Play -- many do not, and (2) your target runtime supports Plug'n'Play, for example some way to invoke .pnp.js
before the app boots up.
@dpolivy the .zip file format does have a spec for storing symlinks. So it might be worthwhile at least to create a ticket asking for Azure App Service Run from Package to support these symlinks when mounting a .zip
file.
@octogonz Thanks for giving it a shot. I'm not entirely sure how it's supposed to work, but maybe @zkochan can offer some suggestions on how to utilize it in this scenario?
And yes, I did test my app with PnP when I was using rush, and it seemed to work OK. It is possible on App Service to specify the command line for invoking your node app, which allows one to insert the parameter to get it to load the .pnp.js
file. The challenge I had, which I think make-dedicated-lockfile
is intended to solve, is that the .pnp.js
I used originally was for the entire repo, when I'd much prefer it to be specific to each "deployed app" (project). As far as filing a feature request on App Service, I did pass that along but I'm not holding my breath waiting for it to happen...
@hbo-iecheruo and I encountered this same problem today with AWS Lambda services. Unlike with Azure App Service's Run from Package, the .zip file gets extracted rather than being mounted as a read-only disk volume. But there is no lifecycle step where symlinks can be created, so the requirements are exactly the same.
In https://github.com/pnpm/pnpm/issues/2198#issuecomment-710882357 the conclusion for PNPM was:
So to summarize.
In order to deploy a project from a workspace use @pnpm/make-dedicated-lockfile
If the environment that you are deploying to doesn't work with symlinks well, or it does not support symlinks, then use Plug'n'Play, which is shipped with pnpm v5.9. Create the following
.npmrc
in the root of your project:
node-linker=pnp
symlink=false
But we can add a few observations:
rush deploy
design is geared for a full-fledged Node.js service which may have a complex node_modules
topology. The goal is to "ship the exact thing that we tested" during development, avoiding transformations like npm install
hoisting that could introduce regressions.
I'd like to propose that rush deploy
should support a special symlink-free mode for these "lightweight" deployment scenarios. We could impose some simplifying restrictions, for example maybe deploymentProjectNames
cannot specify multiple projects.
rush deploy
could do a PNPM Plug'n'Play installation, even though the monorepo is not using Plug'n'Play.
Hypothetically, suppose you did these steps manually:
common/deploy
folder, along with the monorepo's pnpmfile.cjs
pnpm install
in that folder, doing a Plug'n'Play installation without any symlinks.
Step 3 could use npm install
or yarn install
equivalently, but choosing PNPM has the advantage of supporting PNPM-specific features such as pnpmfile.cjs
.
The actual implementation would not really need Verdaccio. Instead it would simply rewire PNPM somehow to install directly from the local folders, producing the same outcome.
As you know, I fully support this 👍
I'd like to propose that rush deploy should support a special symlink-free mode for these "lightweight" deployment scenarios. We could impose some simplifying restrictions, for example maybe deploymentProjectNames cannot specify multiple projects.
One of my scenarios in a monorepo is that I have multiple Node.js apps that share common modules, but ultimately get packaged and deployed separately. And also some Node.js apps that get packaged and deployed together. If you do add this functionality, it would be great if these scenarios were both supported.
any updates here?
I'm keen to use Rush on a new project, but am currently stuck if I can't create an asset without using symlinks as the application is being deployed as an AWS Lambda (we'll almost definitely hit the same scenario with Azure). I'm happy to test this as soon as anything is ready to go.
No idea if this works with Rush, but pnpm 6.25.0 now has a new configuration option node-linker=hoisted
which can be added to .npmrc
.
Last time I tried using React Native & @rnx-kit/metro-resolver-symlinks with Rush there were still issues, but hopefully this may resolve it.
Any update about that subject?
Currently facing the same issues about WebApp deployments on Azure as the RUN_FROM_PACKAGE
can only be done without symlinks, unfortunately.
Being able to provide a flat and ugly hoisted node_modules should do the job to bypass systems not supporting symlinks (putting aside all doppelgangers and drawbacks of not using symlinks that we'll have to assume at some point).
There might be a workaround using directly pnpm
but it would be cool to have that option directly from Rush.
I was able to deploy aws lambda service from pnpm monorepo. https://github.com/UROjQ6r80p/pnpm-aws-monorepo/
Also another user mentioned that possibility here: https://github.com/pnpm/pnpm/issues/6259#issuecomment-1712158649 I did not find any information from AWS about that, nothing in here about symlinks: https://docs.aws.amazon.com/lambda/latest/dg/lambda-releases.html
https://pnpm.io/npmrc states: Some serverless providers (for instance, AWS Lambda) don't support symlinks
Did I overlook anything? Do I understand it was not possible before on AWS Lambda?
git clone https://github.com/UROjQ6r80p/pnpm-aws-monorepo
cd pnpm-aws-monorepo
pnpm install
cd services/aws-lambda
pnpm --filter=aws-lambda --prod deploy dist
cd dist
zip --symlinks -r dist.zip ./
dist.zip
to aws lambda.
No node-linker=hoisted
, default pnpm config used.
No unnecessary modules from other packages bloating your lambda.
Lambda:
git clone https://github.com/UROjQ6r80p/pnpm-aws-monorepo
cd pnpm-aws-monorepo
pnpm install
cd services/aws-lambda
pnpm --filter=aws-lambda --prod deploy dist
7z a -snl -ttar dist dist/
will be saved to dist.tar
dist.tar
to Linux system. I use WSL
tar -xvf dist.tar
cd dist
zip --symlinks -r dist.zip ./
dist.zip
to AWS Lambda.
Is this a feature or a bug?
Please describe the actual behavior.
rush deploy
is great, however it has a fundamental incompatibility with deployments that don't support symlinks, such as Azure App Service's Run from Package. In this scenario, zipping up the deployment directory creates copies of all the files referenced by symlinks, but it means that only the main dependencies of each app are able to be resolved -- any dependencies of those dependencies cannot be resolved, as that is dependent on resolving based on the symlink'ed location of the module. Therefore, apps that are packaged directly are unable to run without symlinks. Unfortunately, the Run from Package feature does not support TAR files (which do support symlinks).
What is the expected behavior?
I would expect
rush deploy
to be able to produce output that could be deployed without relying on the symlinks. One thought here is that the common\temp\node_modules\.pnpm\node_modules\
directory could be symlinked/located in the root, which might solve the problem, although in a ZIP of the deploy directory would lead to quite a bit of duplication and bloat. If there were a way to just have all of the required node modules for production (including local projects) stored in a flat structure in the root of the deploy folder, that would be ideal, as then there would be only a single copy of each module in the deployment.
If this is a bug, please provide the tool version, Node.js version, and OS.