Open dragospopa420 opened 1 year ago
Hi @dragospopa420,
I think the Docker images are not part of this repo. You should probably raise the issue in the apify-actor-docker repo instead since, if I understand correctly, that is the repository for Apify's Docker images.
Source code of the image that you are referencing is here: https://github.com/apify/apify-actor-docker/tree/master/node-puppeteer-chrome
Thanks @ivanvs. Thanks @B4nan for transferring the issue. Thanks @mtrunkat.
I've also had some time to test this image and it seems to perform well. I haven't found anything wrong with it.
Thanks, @dragospopa420. The image size is currently something we plan to look into.
CC @fnesveda @B4nan, please take a look
I asked @vladfrangu to take a closer look last week. IIRC the reason we use Ubuntu was to support Chromium; the rest of the browsers should be fine with Debian?
This image is using Alpine. Chromium works fine on Alpine. Deployed it in some production environments already. From what I see Firefox is in the community repo of Alpine and it works properly.
Super sorry for the late response! The main image that (probably) can't use Alpine is WebKit (Safari). It's good to know that Chromium works on Alpine, but does Chrome work on it too? 👀
I believe the main reason we use Debian in the base images is compatibility with user libraries. Debian uses glibc, while Alpine uses an alternative libc implementation, musl libc, which is not 100% compatible. While musl libc behaves more correctly according to standards, most software is written targeting glibc and all its quirks, and could break when used with musl libc (or would have to be recompiled at least). So I would recommend staying with Debian for these compatibility reasons.
I believe most of the size difference between the image produced by @dragospopa420's Dockerfile and what we have is down to other differences:
- node_modules in the folder, some of which seem unnecessary
- chrome.deb (download, install and removal should be in the same step ideally)

You can use the great dive tool to inspect the images layer by layer and see what's taking up most of the size.
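
To illustrate the point about doing the chrome.deb download, install and removal in the same step, here is a minimal sketch of a single RUN instruction. It is not the actual Apify Dockerfile: the base image tag, the /tmp/chrome.deb path and the choice of curl are assumptions made for the example, and the download URL is Google's public amd64 package, so it only applies to that architecture.

```dockerfile
# Assumed base image for illustration; the real Apify images may differ.
FROM node:20-bookworm-slim

# Download, install and delete chrome.deb inside one RUN instruction, so the
# .deb file and the apt package lists are gone before the layer is committed.
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && curl -fsSL -o /tmp/chrome.deb https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb \
 && apt-get install -y --no-install-recommends /tmp/chrome.deb \
 && rm -f /tmp/chrome.deb \
 && rm -rf /var/lib/apt/lists/*
```

If the download and the removal were split into separate RUN instructions, the .deb would remain stored in the earlier layer even after being deleted in a later one. Inspecting the built image with dive should make that kind of leftover visible.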
Which package is the feature request for? If unsure which one to select, leave blank
@crawlee/core
Feature
The base image is based on Debian, which has a much bigger footprint than Alpine Linux. So I was thinking the included Dockerfile could be based on Alpine Linux instead, for faster deployment and testing. The apify/actor-node-puppeteer-chrome image is 2.53 GB; my version is 698 MB.
Motivation
I'm building an infrastructure of spiders based on Crawlee and I wanted to have the fastest possible deployment time.
Ideal solution or implementation, and any additional constraints
Alternative solutions or implementations
No response
Other context
No response