Open ryanblock opened 5 years ago
I'm thinking said fix (whether it originates from lambci/git-lambda-layer, or here) may be occasion for a 1.0 bump – open to thoughts on that!
Definitely makes sense to me :+1:. There's no specific reason we're not on 1.0 yet, just an initial abundance of caution, bumping sounds like the right idea.
I think I partially misunderstood AWS's Lambda container rollout plan, which can be summarized as:
nodejs10.x
Lambdas: run AWS Linux 2, which has breaking changes with the lambci/git-lambda-layer
; presumably AWS Linux 2 will be the container env for future Lambda runtime releasesAMI 2018.03
, which should not (in theory and early testing) have any breaking changes with the lambci/git-lambda-layer
What's not yet clear is whether a git package that bundles all its own dependencies will be compatible with both container environments (and assuming it was, whether it would be a tenable file size).
So, this begs a question of direction. Since there's now effectively a fork in Lambda environments, do we maintain a version for each?
I could see doing a minor bump right away with the latest git currently available in lambci/git-lambda-layer
(2.20.0
), putting pre 1.0 into bug fixes / maintenance mode; then bumping major for AWS Linux 2 compatibility, due to its a breaking changes for nodejs10.x
.
That's tidy, but could be frustrating for anyone installing lambda-git@latest
that isn't on nodejs10.x
(or future AWS Linux 2 Lambda runtimes).
Fwiw, sounded a bit like mhart was going to build an entirely new layer to solve for this, but layers are just ARNs, so they don't really have the NPM (and soon GH Registry) namespace considerations of this module.
Can we easily detect the AWS Linux version in use from Node at runtime? By the sound of it, for now the Node version is a clear marker, though it's unclear if that'll always be true.
Given we can, I'd split the package by versions as you suggest, but leave a check in each that warns and exits hard if it's run in the wrong environment, and then leave the npm latest
tag pointing at the one we think is most popular for now. And document all that of course.
E.g, assuming in the short-term that AWS Linux 1 is still the common use case, I'd publish:
1.0.0
<-- up to date version of everything for Linux 1, with a warning if run in Linux 22.0.0-beta
<-- up to date version of everything for Linux 2, in beta until we're sure this all works (your call), with a warning if run in Linux 1with the latest
dist tag pointing to 1.0.0
for now, presumably, so that's what's installed by default.
At some point later on we can finalize 2.0.0 if everything looks good, and start pointing latest
to that once we think most people are using AWS Linux 2 (I suspect you have a far better feel for that than me).
How does that sound? Potentially frustrating for anybody on the wrong side of our latest
guess if they don't read the docs, but if they don't read the docs that's always going to be true, and at least clearly printing an error at startup can make it very easy to spot & resolve.
News: Node 10 git binary is a go! https://github.com/lambci/git-lambda-layer/issues/13#issuecomment-547484502
What with node8.10
about to go EOL on Lambda, I'm thinking there is no longer much point to updating the legacy version or splitting the package / providing backwards support – within a few months those Lambdas won't be able to be updated anyway.
Seems like the move here is to get the new git image into 2.0.0 RC for nodejs10.x
and go from there. Thoughts?
What with node8.10 about to go EOL on Lambda, I'm thinking there is no longer much point to updating the legacy version
Just for my reference later, some dates from https://docs.aws.amazon.com/lambda/latest/dg/runtime-support-policy.html:
All sounds good - the hard deprecation here makes a convincing case to me too for just focusing on Node 10 as the default asap, along with a simultaneous bump to v1.0.0.
Definitely keeps things simpler too, works for me :+1:
Right on. Will get a PR cooking for this this week!
Ok! Got a PR going for 2.0.0
here (I just assumed to go to 2.0.0 mainly because I didn't take the time to figure out whether 0.x (which is unstable in semverland) → 1.x would be considered a major the same as 1.x → 2.x). I added optional callback support, improved error handling, a new test suite, GitHub Actions CI config, and, of course, updated git
for nodejs10x
.
Unfortunately it's not good news. I probably should have tested lambci/git-lambda-layer
's layer binaries before I got as far as I did with this PR, because there appear to be some blockers that I'm not so sure we can get around.
So, a quick bit of background: the nodejs8.10
's Lambda microcontainer image was/is about 1.2GB, and contained a LOT of what we think of as being "standard" Linux stuff in there. Binaries, tools, libs, etc.
When AWS released nodejs10.x
, they took a pretty different approach to the image, and made it super lean. This meant removing a ton of shared Linux libraries, binaries, and stripping it down to the bone – and they got the whole thing down to about 400MB (which is seriously impressive weight loss, imo!).
This is (in theory) a major boon to cold start times and other operations, but unfortunately it also meant that any custom binaries could no longer rely on standard system libraries. No bigs, being the mensch that @mhart is, he built a lot of great tooling (like docker-lambda and more recently yumda) to make it easier to compile binaries for Lambda targets.
However, the assumptions of extracting binaries to the filesystem via userland / Node modules vs via Lambda layers are different. Runtimes do their userland work in Lambda within /tmp
, while Layers are contained within /opt
. When I extracted the nodejs10.x
git
package and began testing in real Lambdae, git commands beyond --version
were failing to find libs.
Looking into some of the binaries, it appears they may be calling libs from an absolute path that assumed a layer (/opt
), and not relatively to their location in the filesystem (which, although it's been a while since I've done Linux lib hacking, I believe is a totally normal phenomena – but please correct me!).
This wasn't an issue before because all the common libs needed to run these binaries existed on the filesystem in nodejs8.10
.
Just for completeness, I was curious to see if we could extract to /opt
, but it is not userland editable, so this module can't write its contents there:
Uncaught Exception
{
"errorType": "Error",
"errorMessage": "EROFS: read-only file system, open '/opt/._bin'",
"code": "EROFS",
"errno": -30,
"syscall": "open",
"path": "/opt/._bin",
"stack": [
"Error: EROFS: read-only file system, open '/opt/._bin'"
]
}
Unless anyone can think of other approaches, it seems like that means we'd need to take on compiling git just for this module in order to make it work. In terms of level of effort vs. wheel reinvention vs. other subject considerations (related to my own use cases), I don't think I'd recommend that, and don't personally want to take on that workload.
I'm a bit bummed out, really thought I could make this happen while still drafting on the good work of the lambci/git-lambda-layer
project, but I think the way nodejs10.x
works, and the fact that node8.10
is nearly EOL together puts this project at an impasse.
@pimterry wdyt?
Interesting! Yeah, that sounds like a pain. Can you paste the errors you're seeing here, and the output of ldd $thegitbinary
after extraction? That should highlight the issue a little more clearly.
Looking into some of the binaries, it appears they may be calling libs from an absolute path that assumed a layer (/opt), and not relatively to their location in the filesystem (which, although it's been a while since I've done Linux lib hacking, I believe is a totally normal phenomena – but please correct me!).
I think absolute paths are common, but by no means required. You can have relative paths (although they're relative to your working directory, not the binary) or bare names (which will be looked up using LD_LIBRARY_PATH
etc). There should be a route through with those.
Unless anyone can think of other approaches
You are right: the 'correct' solution is probably to set up our own pipeline to compile Git for Lambda, for userland instead of layerland, but I agree that's a hassle.
I think there are other options though. For example, a quick google turns up this fun SO question: https://unix.stackexchange.com/questions/448789/fix-hardcoded-dynamic-linking-for-executable.
They're resolving a very similar issue using patchelf, which lets you to rewrite the paths in the binary directly. I think this should look something like:
ldd $gitbinary
- get the list of the linked librariespatchelf --replace-needed $bad_absolute_path $a_better_path $gitbinary
In theory we could do that fairly easily, although it might just expose a new problem. I don't really have time to dig into this myself right now, but if you've got a minute, do you want to give it a go?
You guys got this! 👀 this thread 👍
Just a reminder for anyone lurking here that doesn't know – you can get git
working in your Lambda by just adding a layer: https://github.com/lambci/git-lambda-layer
If that's not an option for you, then I'd understand why you'd want to continue using this module, but just figured I'd chime in to make sure ppl know.
Thanks for that @mhart - I will check it out. The layer is new right? (~17 days ago the support was added). I think that is why I used this one a while back. We don't have any restrictions for adding the layer in on our side (just need to spend a few minutes to update the cloudformation). May be a good idea to add that to the README for folks on that repo.
@mhart for sure! At the time I bugged @pimterry to undeprecate this package we couldn’t yet make use of layers.
@pimterry re patching the elf and repackaging the binaries, sounds p straightforward and easy to automate (assuming it works), but I’m also tapped out on time right now. Not sure when I’ll be able to get to it, so if someone wants to pick up where I left off on the 2.0 branch (or if you opt to re-deprecate this package) either is fine imo!
@dorian-coygo the layer that supports git on all runtimes except nodejs10.x
has been around for a year, but you're right that nodejs10.x
support is only 17 days old. Not sure what you mean about adding to the README?
@mhart - I went down the rabbit hole this morning. @ryanblock is right about how slim AWS Linux 2 is. It doesn't have curl
or even tar
. I tried forking the layer and adding curl
and publishing it to my AWS account, but it looks like there are more issues and headaches (I get a curl: error while loading shared libraries: /opt/lib/libcurl.so.4: file too short
). So then I decided to just use the request
package and ditch curl. That is when I ran into the tar
command not found. Solved that by adding it to the layer, but now I am getting random tar: Error is not recoverable: exiting now
(which requires me to either debug my simple request
code or some issue with tar
.) There isn't a lot of documentation around AWS Linux 2 and not all runtimes use it.
I may just wait to see if this package has more success or just write my lambda in a different language that still uses AWS Linux image.
As for the README, you explain how to use the AWS Layer via the UI. You could add a snippet about how to add it to the CloudFormation such as
Resources:
<name of lambda>:
Type: AWS::Serverless::Function
Properties:
CodeUri: <location of lambda code>
Layers:
- !Sub arn:aws:lambda:us-east-1:553035198032:layer:git-lambda2:2
@dorian-coygo if you want any of those binaries (curl, tar, etc), you can use https://github.com/lambci/yumda
Are you having issues with the git-lambda2
layer? If so, feel free to post issues over at that repo
Btw, in terms of explaining how to use it in the UI, I have exactly an example of that in the README here: https://github.com/lambci/git-lambda-layer#getting-started
Just wanted to give an update that I also moved to layers solution and used the git-lambda-layer as a guide (since it didn't have all the dependencies I needed).
Since AWS slimmed down AWS Linux 2, I don't think the lambda-git would be good enough for most people since you will probably have to build a layer of dependencies anyways.
Steps for others:
git, curl, tar, gzip
Btw, in terms of explaining how to use it in the UI, I have exactly an example of that in the README here: https://github.com/lambci/git-lambda-layer#getting-started
FWIW: I was referring to adding a CloudFormation snippet in the readme (in case someone didn't want to do it manually in the UI).
See also: https://github.com/lambci/git-lambda-layer/issues/13
Changes in Lambda nodejs10.x appear to break this library; specifically, its build of git, which is based on
lambci/git-lambda-layer
's build (hence the issue above).I'll continue to monitor progress on the next git distribution for the AWS Linux 2 image that Lambda nodejs10.x uses.
Related, I'm thinking said fix (whether it originates from
lambci/git-lambda-layer
, or here) may be occasion for a1.0
bump – open to thoughts on that!Edit: misunderstood rollout plan/schedule of AWS Linux 2 – while it's a breaking change for this library in Lambda nodejs10.x, it's not going to be forced on non-nodejs10.x users (yet, anyway!).