Open timocov opened 11 months ago
Thanks for reporting this issue. 7 copies of puppeteer and the chromium browser got downloaded because the puppeteer version you specified differs from the version of puppeteer specified by memlab and its sub-modules.
Instead of having puppeteer in the dependencies memlab should use puppeteer-core instead.
puppeteer-core does not include the chromium browser, if the machine installing memlab does not have a chromium binary it would be a problem.
For a streamlined installation with only a single download of Puppeteer and the Chromium browser, consider using this package.json configuration if your project can use an older version of puppeteer:
{
  "private": true,
  "dependencies": {
    "memlab": "^1.1.40",
    "puppeteer": "^13.5.1"
  }
}
To save disk space for now, you can also delete the other 6 copies of puppeteer:
rm -rf ./node_modules/@memlab/*/node_modules/puppeteer
rm -rf ./node_modules/memlab/node_modules/puppeteer
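Before deleting anything, it may help to check which copies are actually installed and how much space each one takes. The helper below is only a sketch: `list_puppeteer_copies` is a hypothetical name (not part of memlab or npm), it assumes a standard `node_modules` layout, and it reads the version with a simple `sed` match rather than a real JSON parser.

```shell
#!/bin/sh
# Hypothetical helper (not part of memlab or npm): print version, size and path
# of every "puppeteer" package directory found under a node_modules tree.
list_puppeteer_copies() {
  root="${1:-./node_modules}"
  # Only match directories named "puppeteer" that sit directly inside a
  # node_modules folder; -prune avoids descending into each match.
  find "$root" -type d -name puppeteer -path '*node_modules/puppeteer' -prune 2>/dev/null |
  while IFS= read -r dir; do
    # Skip anything that is not an installed package.
    [ -f "$dir/package.json" ] || continue
    # Naive extraction of the "version" field; assumes a conventional package.json.
    version=$(sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$dir/package.json" | head -n 1)
    size=$(du -sh "$dir" | cut -f1)
    printf '%s\t%s\t%s\n' "$version" "$size" "$dir"
  done
}
```

Calling `list_puppeteer_copies ./node_modules` from the project root prints one tab-separated line per copy. More than one line with different versions suggests the duplication described above.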
A better solution would be to upgrade the puppeteer used by memlab to the latest version. At the moment, memlab is using an older version of Puppeteer because the code is shared and used for internal testing at Meta and there are certain factors that lead to the use of the older version for now. I'll attempt an upgrade.
@JacksonGL Unfortunately, while this solution would help in this particular example, it doesn't solve the problem itself. If for any reason you have a different puppeteer version in your package.json, you will get this issue again.
puppeteer-core does not include the chromium browser, if the machine installing memlab does not have a chromium binary it would be a problem.
Yes, I know that. I'm not familiar with the code, but I really doubt that every single package in the library requires puppeteer with the browser over puppeteer-core, apart from the root memlab package. If the library is supposed to be installed by adding only the memlab package (i.e. without adding any @memlab/* package manually to package.json), you could use puppeteer-core in every "helper" package and puppeteer in the root package that runs the library. In this case only 1 puppeteer package would be in the tree and only 1 additional browser would be downloaded.
every single package in the library requires to have puppeteer with the browser over puppeteer-core, apart from the root memlab package
Maybe memlab isn't that package, as it might be just a CLI; the root package should be whichever one actually runs puppeteer (but ofc memlab can pass its puppeteer path to any sub-package).
@timocov Having a single package that encapsulates all puppeteer calls is a good idea. I will upgrade puppeteer which was also planned for other reasons, and move things around for the "helper" package later when I get a chance.
If for whatever reason your repo uses a different puppeteer version than the one memlab uses, you can end up with 7 downloaded chromium instances, which use around 2.5 GB of disk space instead of less than 500 MB, and it takes time to download all of them (even though they are all the same):
Reproduction
Use this package.json and run npm install in the folder with it:
Possible solution
Instead of having puppeteer in the dependencies memlab should use puppeteer-core instead.