floydspace / serverless-esbuild

💨 A Serverless framework plugin to bundle JavaScript and TypeScript with extremely fast esbuild
MIT License
451 stars 138 forks source link

Regression of #110, package.patterns are ignored #150

Closed kabo closed 3 years ago

kabo commented 3 years ago

It seems #110 is back.

serverless.yml:

functions:
  myFn:
    package:
      patterns:
        - resources/some.txt

Running serverless package with serverless-esbuild v1.10.6 works as expected.

$ cd .serverless
$ unzip myFn.zip
Archive:  myFnl.zip
  inflating: resources/some.txt
  inflating: myFn.js

But with v1.10.7 and onwards it's no longer working, the extra file is missing in the zip.

$ cd .serverless
$ unzip myFn.zip
Archive:  myFn.zip
  inflating: myFn.js
kabo commented 3 years ago

Ah, wait, it's not that simple. I took a look at the commit between 1.10.6 and 1.10.7. The issue is here: https://github.com/floydspace/serverless-esbuild/commit/e504e48220ea910b5e8fd55aef58166e8b14821b#diff-c0b27c44c8eee1df38854f691661101b9de7cfcf5b832817bd6012983d3aa9beR110

In my serverless.yml:

provider:
  stage: test

I then run yarn serverless package -s prod. In 1.10.6 the zip had "prod" in the filename, from 1.10.7 it has "test" in the filename. If I change provider.stage to prod in my serverless.yml it works again. So it's failing to take the commandline option into account, that's the real bug, that the file fails to be included is just a downstream bug from that.

kabo commented 3 years ago

No, wait, the above comment only makes version 1.10.7 - 1.13.1 work. Version 1.14.0 doesn't include the extra file.

olup commented 3 years ago

I can reproduce this if like you I am adding the pattern option on an individual function. However when I set the pattern on the global packaging options it works (and include the file on every function). Could you confirm that on your end ?

kabo commented 3 years ago

Yeah, I can confirm that works. But it's pretty useless when you have one lambda that requires one big file, and you want to keep all the other lambdas in the service that don't need that file as small as possible.

olup commented 3 years ago

Yeah, agreed. But now I can look for a fix ;-)

olup commented 3 years ago

I am working on the bug. Making a note here:

The regression seems to come from #148 - the move to Bestzip. Bestzip works in a different way than what we had previously - most importantly, it does not allow to zip a file from a path to another path inside the zip. the individual patterns or file include relied on the creating of a temporary folder holding the desired files so that each function only get whats it was asking for.

So the trick was to zip tempDir/path/file.xx to path/file.xx inside the archive. But bestzip does not allow this, and let us zip only the files just as they are declared in the file-system.

I am investigating ways to fix this. We could imagine creating another temporary directory and copying all files with the proper directory structure before zipping. What about performances - does it cancel the gains of using bestzip ?

Also while on bestzip case, note that also dotfiles are ignored by wildcards. There is a solution provided by bestzip, but unaware users would be surprised by the behavior, wouldn't they ? Or maybe it is standard ? see https://www.npmjs.com/package/bestzip. @floydspace and @vamche what do you think of all this ?

kabo commented 3 years ago

dotfiles being ignored by wildcards is pretty standard, e.g. if you do cp folderA/* folderB/ it won't include files that start with ..

bestzip is sounding more like mediocrezip :P

But yeah, some extra steps to copy files to where they need to be sounds like a reasonable solution to me. And shouldn't incur any performance loss if there are no extra individual files to package right?

olup commented 3 years ago

bestzip is sounding more like mediocrezip :P

I am also not a fan of bestzip. It was repeatedly proposed as it will use native zip call when available, and small benchmarks has been shown to reduce archiving time by half sometimes. I can't argue with that, even tho the lib is not that well maintained - or so I feel.

But then the path structure limitation comes from this native feature I believe, as on some system the command might not let you reorganize files in the archive.

The performance loss would be related to duplicating again a large node_modukes on disk, because the fragmented nature of it could be a hard task on some machine (I am talking deci-seconds, or a second in the worst cases, but that is bestzip performance gain)

vamche commented 3 years ago

So, there are two ways, how bestzip can zip, nativeZip and nodeZip. If we can confirm the path structure issue limitation is because of nativeZip, we can provide an option in YAML to choose an approach like below, internally the chosen approach will be used for zipping.

custom:
  esbuild:
    zip: node # defaults to native

It's just that I don't want to lose the capability of nativeZip because the zipping is very fast with it as you can see here. FYI, in that benchmarking GitLab CI was using nodeZip since there was no native zip installed, when we installed zip in the container the zip time was reduced even more to < 200 seconds.

kabo commented 3 years ago

The performance loss would be related to duplicating again a large node_modukes on disk, because the fragmented nature of it could be a hard task on some machine (I am talking deci-seconds, or a second in the worst cases, but that is bestzip performance gain)

Ah, ok, the use-cases I have is not duplicating node_modules, in one case I have a certificate that is needed by a lambda, in another case it's a pretty big binary that is needed by one lambda. Just one file in both cases.

olup commented 3 years ago

So, there are two ways, how bestzip can zip, nativeZip and nodeZip. If we can confirm the path structure issue limitation is because of nativeZip, we can provide an option in YAML to choose an approach like below, internally the chosen approach will be used for zipping.

custom:
  esbuild:
    zip: node # defaults to native

It's just that I don't want to lose the capability of nativeZip because the zipping is very fast with it as you can see here. FYI, in that benchmarking GitLab CI was using nodeZip since there was no native zip installed, when we installed zip in the container the zip time was reduced even more to < 200 seconds.

Making this optional is not very useful, and complicated for the end-user. I feel we have to choose one path and do it well. If we decide to support the native option, then duplication is needed. But the performance overhead might not be too bad, I'll make a PR and we can benchmark.

olup commented 3 years ago

Ah, ok, the use-cases I have is not duplicating node_modules, in one case I have a certificate that is needed by a lambda, in another case it's a pretty big binary that is needed by one lambda. Just one file in both cases.

Indeed. But to create a proper folder structure of you package we then have to duplicate every files of it in a specific temporary folder before zipping. That includes the whole node_nodules - albeit only the externals you require for each function, which is hopefully not too much.

github-actions[bot] commented 3 years ago

:tada: This issue has been resolved in version 1.17.1 :tada:

The release is available on:

Your semantic-release bot :package::rocket: