ulixee / hero

The web browser built for scraping

Error during deployment on AWS Lambda #283

Open · tfa346 opened this issue 1 month ago

tfa346 commented 1 month ago

Hi,

I'm trying to run this project on AWS Lambda via a Node.js layer. When I invoke the function, `new Hero()` works fine. However, at the `hero.goto` step, this error is thrown: "/opt/nodejs/node_modules/@ulixee/hero-core/node_modules/better-sqlite3/build/Release/better_sqlite3.node: invalid ELF header"
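For reference, the handler is essentially this (the URL is a placeholder); `new Hero()` resolves fine, and the error only surfaces on `goto`:

```ts
// handler.ts -- minimal flow that reproduces the error
import Hero from '@ulixee/hero';

export async function handler(): Promise<void> {
  const hero = new Hero(); // this step works
  try {
    await hero.goto('https://example.org'); // "invalid ELF header" is thrown here
  } finally {
    await hero.close();
  }
}
```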

Does anyone have some kind of workaround for this? Thanks

mehrdad-shokri commented 1 month ago

I tried to make Chrome work in a Lambda environment, but had no success. Repos like chrome-aws-lambda didn't work for me either (it was last updated three years ago), so I decided to host it on EC2 instead.
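One way to structure the EC2 setup is a shared core on the instance with lightweight clients connecting to it over a websocket. A rough sketch, assuming `@ulixee/cloud`'s `CloudNode` and Hero's `connectionToCore` option work the way the Ulixee docs describe (the host and port below are placeholders):

```ts
// server.ts -- runs on the EC2 box; starts a shared core for remote Hero clients
import { CloudNode } from '@ulixee/cloud';

(async () => {
  const cloudNode = new CloudNode();
  await cloudNode.listen();
  console.log(`CloudNode listening on port ${await cloudNode.port}`);
})();
```

```ts
// client.ts -- runs anywhere; drives the remote browser over a websocket
import Hero from '@ulixee/hero';

(async () => {
  const hero = new Hero({
    connectionToCore: { host: 'ws://<ec2-host>:1818' }, // placeholder address
  });
  await hero.goto('https://example.org');
  console.log(await hero.document.title);
  await hero.close();
})();
```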

tfa346 commented 1 month ago

> I tried to make Chrome work in a Lambda environment, but had no success. Repos like chrome-aws-lambda didn't work for me either (it was last updated three years ago), so I decided to host it on EC2 instead.

Ok, my problem is that I need to run a couple hundred robots simultaneously, so I can't use EC2 for that. I'll dig for another solution. Thanks for your answer!

mehrdad-shokri commented 1 month ago

Why do you need a hundred robots at the same time? If you want to scrape 100 websites at once, roughly 10 instances could probably handle 100 sites in a reasonable time. But if you figure out a way to run this on Lambda, please let me know; I'd like to know how to do it. I'm looking into this too, and it seems like a promising project.
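To illustrate what I mean by one instance handling many sites, here's a rough sketch of a bounded-concurrency runner; the limit of 10 and the URL list are made up, so tune them to your instance size:

```ts
// scrape-many.ts -- fan out scrapes with a hand-rolled concurrency limit
import Hero from '@ulixee/hero';

const urls: string[] = ['https://example.org' /* ...hundreds more... */];

async function scrapeTitle(url: string): Promise<string> {
  const hero = new Hero(); // or pass connectionToCore to share one remote core
  try {
    await hero.goto(url);
    return await hero.document.title;
  } finally {
    await hero.close();
  }
}

// run at most `limit` scrapes at once; plain worker loop, no extra deps
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, async () => {
      while (next < items.length) {
        const i = next++; // safe: single-threaded between awaits
        results[i] = await fn(items[i]);
      }
    }),
  );
  return results;
}

mapWithLimit(urls, 10, scrapeTitle).then(console.log).catch(console.error);
```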

blakebyrnes commented 1 month ago

Hi @tfa346, sorry you're having issues. If you're still looking into this, how did you package your app for upload? That error sounds like a mismatch between Lambda's target architecture and the packaged files that were uploaded. FWIW, I haven't tried to make this work on Lambda yet, and I'd expect the Chrome issue to be the bigger one (as @mehrdad-shokri pointed out). From what I know, you have to include all the packaged files, and Lambda has (or used to have) a hard limit on package size.
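One quick way to test the architecture theory is to inspect the header of the better_sqlite3.node file that actually got uploaded. Here's a sketch (the default path mirrors the one in your error message; the e_machine values come from the ELF spec -- this is just a diagnostic, not part of Hero):

```ts
// check-native-arch.ts -- inspect the header of the uploaded .node binary
import { readFileSync } from 'fs';

// default path mirrors the error message; override via argv if needed
const path =
  process.argv[2] ??
  'node_modules/better-sqlite3/build/Release/better_sqlite3.node';

const buf = readFileSync(path);

if (buf.readUInt32BE(0) !== 0x7f454c46) {
  // not ELF at all: e.g. a Mach-O binary built on a Mac dev machine,
  // which is exactly what yields "invalid ELF header" on Linux
  console.log('Not an ELF binary -- it was not built for Linux.');
} else {
  // e_machine lives at byte offset 18; 0x3e = x86-64, 0xb7 = aarch64
  const machine = buf.readUInt16LE(18);
  const known: Record<number, string> = { 0x3e: 'x86_64', 0xb7: 'arm64' };
  console.log(`ELF binary for ${known[machine] ?? '0x' + machine.toString(16)}`);
}
```

If it turns out not to be ELF at all, the module was most likely compiled on macOS or Windows. In that case, rebuilding node_modules inside an Amazon Linux container (e.g. one of the public.ecr.aws/lambda/nodejs base images) before zipping the layer usually fixes that class of error.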