Closed ThallesP closed 1 year ago
I am facing similar issue. @ThallesP Did you find any solution?
@ajaykarthikr couldn't find anything, for some reason, it just can't find the binary, even tho it seems to be in the S3 object.
@ThallesP I managed to make this work. I just used serverless deploy to run it directly on AWS Lambda. It works really well on AWS. If you provide some code snippets, maybe I can help.
@ajaykarthikr Thanks for the response, this is what I'm trying currently to launch the browser:
const browser = await puppeteer.launch({
  args: chromium.args,
  executablePath: await chromium.executablePath,
  headless: true,
});
This information is not sufficient to debug 😅
@ajaykarthikr Here's the full repo: https://github.com/raphaelc484/serverless-webscraping 😅
@ThallesP @ajaykarthikr How did you guys manage to get it deployed with serverless? I cannot get past the size limit when trying to do so:
Hey @anking I have written a blog about this, explaining how to make it work with serverless.
It might help you. Though it uses chrome-aws-lambda, it should work with this library too.
@ajaykarthikr it doesn't say anything about the sizes. The only way I can get this thing deployed now is by using layers
Hey @anking nothing can be done if your build size exceeds 50 MB. You just have to use layers or try running in a Docker image.
I've got it working with serverless/esbuild/layers, some steps/config here in my issue if it helps:
https://github.com/Sparticuz/chrome-aws-lambda/issues/19#issuecomment-1204426373
Make sure that you are not loading puppeteer or aws-sdk in your dependencies, but in your devDependencies. Only puppeteer-core and @sparticuz/chrome-aws-lambda need to be in dependencies (and any other package you use). With this you might get down below the limit...
If not, if you use e.g. serverless deploy --stage test, Serverless by default deploys through an S3 bucket for you, and then you can deploy larger packages!
My package is about 70 MB and it deploys fine using serverless deploy.
Also upgrade to Serverless v3, as they have rebuilt the complete deploy function and it is much faster!
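For anyone who wants to check the dependency split described above before deploying, here is a minimal sketch. The helper name `findMisplacedDependencies` and the `DEV_ONLY` list are my own (hypothetical, not part of Serverless or puppeteer); it only looks at the parsed contents of package.json.

```javascript
// Hypothetical helper, not part of any library mentioned in this thread.
// Packages the advice above says to keep out of "dependencies".
const DEV_ONLY = ["puppeteer", "aws-sdk"];

// Returns the names listed under "dependencies" that should live in
// "devDependencies" instead, so they are not bundled into the deployment zip.
function findMisplacedDependencies(pkg, devOnly = DEV_ONLY) {
  const deps = Object.keys(pkg.dependencies || {});
  return deps.filter((name) => devOnly.includes(name));
}

// Illustrative package.json contents (not read from disk).
const examplePkg = {
  dependencies: {
    "puppeteer": "^14.4.1",
    "puppeteer-core": "^10.1.0",
  },
  devDependencies: {},
};

console.log(findMisplacedDependencies(examplePkg)); // [ 'puppeteer' ]
```

In a real project you would pass in your own parsed package.json instead of `examplePkg`; an empty result means nothing from the list is being packaged by mistake.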
Chromium has really grown in size over the years and is pretty much not suitable to be deployed as a package dependency. The reason this package exists is because chromium is like 150 megs and we have to compress it in order to deploy, then it's decompressed on the fly. Some updates to the documentation might be in order. Plus, binary dependencies imo should be layers.
Wow, yes, it's really growing rapidly... I am testing a bit with the latest versions: my last deploy was 69 MB, but now it has increased to 75 MB!
Using Serverless v3 it's still possible to deploy without using Layers:
chrome-aws-lambda is around 50 megs. If you look in your .serverless folder after deploy, you'll see the zip file that serverless uploads. You can extract it and look at that folder using something like tree-size-free on Windows, or ncdu on Linux, to see what else is taking up space.
Sorry if I was unclear... I meant to say that it is working fine for me, and Serverless v3 still deploys it, even though it is over "allowed size"... I think Serverless v3 is using the "Application" function of Lambdas to get round the size limit...
I think there might be a misunderstanding about the layer maximum allowed size. The max 'upload' size is 50 megs from the AWS Upload Layer screen; however, you can upload the layer to S3, and then the size limit is 250 megs uncompressed for all 5 layers combined. Serverless uses the direct S3 upload option, but if you are uploading via vanilla AWS, you'll most likely need to upload to S3, then create a layer from that file.
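To make the two limits concrete, here is a rough sketch of the decision described above. The 50 MB and 250 MB figures are the ones quoted in this thread (check the current AWS docs before relying on them), and `chooseUploadPath` plus the constant names are hypothetical, just for illustration. I'm also assuming 1 MB = 1024 * 1024 bytes.

```javascript
// Limits as quoted in this thread; treat them as assumptions.
const MB = 1024 * 1024;
const DIRECT_UPLOAD_LIMIT = 50 * MB; // zipped size accepted by direct upload
const UNZIPPED_LIMIT = 250 * MB;     // uncompressed size, layers included

// Hypothetical helper: decides which route a package of the given size can take.
function chooseUploadPath(zippedBytes, unzippedBytes) {
  if (unzippedBytes > UNZIPPED_LIMIT) return "too-large";
  if (zippedBytes > DIRECT_UPLOAD_LIMIT) return "s3-upload";
  return "direct-upload";
}

// A 70 MB zip (the size mentioned earlier) with a made-up 200 MB unzipped size.
console.log(chooseUploadPath(70 * MB, 200 * MB)); // s3-upload
```

This matches what people report here: a zip over 50 MB still deploys when it goes through S3, as long as the unzipped total stays under the larger limit.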
Ah, yes, but as the question was framed as ... with Serverless framework?
I answered from that point of view... :)
It's really amazing how much faster v17 is so a huge THANKS for adding that @Sparticuz ! The Lambda runs about 3 times as fast!
With v14:
With v17:
Wow, that's great! puppeteer 15, 16, and 17 are all using chromium 105, so I wonder if there were puppeteer improvements?
@ThallesP I can see you never got your question answered... I had a look at your repo, and there's a few things in your dependencies you need to sort out for it to work:
"dependencies": {
  "@middy/core": "^2.5.3",
  "@middy/http-json-body-parser": "^2.5.3",
  "axios": "^0.27.2",
  // "chrome-aws-lambda": "^10.1.0", // <-- REMOVE, You are not using Sparticuz package
  "@sparticuz/chrome-aws-lambda": "^17.0.0", // <-- ADD
  // "puppeteer": "^14.4.1", // <-- REMOVE, this should be in "devDependencies" only
  "puppeteer-core": "^17.0.0" // <-- Change from ^10.1.0 to ^17.0.0
},
"devDependencies": {
  "puppeteer": "^17.0.0", // <-- ADD
  "@serverless/typescript": "^3.0.0",
  "@types/aws-lambda": "^8.10.71",
  "@types/node": "^14.14.25",
  "esbuild": "^0.14.11",
  "json-schema-to-ts": "^1.5.0",
  "serverless": "^3.0.0",
  "serverless-dotenv-plugin": "^4.0.1",
  "serverless-esbuild": "^1.23.3",
  "serverless-offline": "^8.8.0",
  "ts-node": "^10.4.0",
  "tsconfig-paths": "^3.9.0",
  "typescript": "^4.1.3"
},
Then run npm update (or, to be on the safe side):
rm -r node_modules/*
rm package-lock.json
npm i
I'm trying to use this library with the Serverless framework, but it fails with this error:
Seems that await chromium.executablePath is undefined for some reason. Has anyone got this library working with Serverless framework?
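One thing that may make this easier to debug is failing early when the path comes back empty, rather than letting the launch fail later. Below is a minimal sketch; `resolveExecutablePath` is a hypothetical helper of mine, and it only assumes that `chromium.executablePath` can be awaited as a property, the way it is used in the snippet posted in this thread.

```javascript
// Hypothetical guard, not part of puppeteer or the chromium package.
// `chromium` is whatever object the Chromium package exports; its
// `executablePath` is awaited as a property, as in the snippet in this thread.
async function resolveExecutablePath(chromium) {
  const path = await chromium.executablePath;
  if (!path) {
    // Surface the problem before the browser launch is attempted.
    throw new Error(
      "chromium.executablePath resolved to an empty value; " +
        "check which Chromium package is installed and that its binary is packaged"
    );
  }
  return path;
}
```

The returned value would then go into the executablePath option of the launch call, in place of awaiting the property inline.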