tracking-exposed / trex

youtube & tiktok analysis + youchoose recommendation customizer. backend, extensions, and tooling
https://docs.tracking.exposed
GNU Affero General Public License v3.0

fix(shared): replace 'jsdom' with 'linkedom' to prevent the parser from crashing for memory allocation failure #859

Closed ascariandrea closed 1 year ago

ascariandrea commented 1 year ago

Summary

The YT parser keeps crashing because jsdom uses a huge amount of memory when virtualizing the document for each entry. Replacing it with linkedom seems to solve the issue.
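As an illustration only (this is not the code changed in this PR), one way to keep such a swap contained is to hide the DOM library behind a small function type, so call sites never import jsdom or linkedom directly. The names `ParsedPage`, `PageParser`, `regexParser`, and `extractTitles` are hypothetical; the commented adapter assumes `linkedom` is installed and uses its `parseHTML` export.

```typescript
// Hypothetical: the minimal data a caller needs from a parsed page.
interface ParsedPage {
  title: string;
}

// Any DOM backend is reduced to this function type, so replacing the
// library means replacing one adapter rather than every call site.
type PageParser = (html: string) => ParsedPage;

// A linkedom-backed adapter could look like this (assumes `linkedom` is installed):
//   import { parseHTML } from 'linkedom';
//   const linkedomParser: PageParser = (html) => {
//     const { document } = parseHTML(html);
//     return { title: document.querySelector('title')?.textContent ?? '' };
//   };

// Dependency-free stand-in so the sketch runs on its own.
// It is NOT a real HTML parser; it only looks for a <title> element.
const regexParser: PageParser = (html) => {
  const match = /<title>([^<]*)<\/title>/i.exec(html);
  return { title: (match?.[1] ?? '').trim() };
};

// Call sites depend only on the PageParser type.
function extractTitles(htmls: string[], parse: PageParser): string[] {
  return htmls.map((html) => parse(html).title);
}
```

With this shape, moving from one DOM library to another is a one-line change where the parser is passed in, for example `extractTitles(htmls, linkedomParser)`.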

Changelog

fix(shared): replace 'jsdom' with 'linkedom' to prevent the parser from crashing for memory allocation failure

Test Plan

Make sure you have at least 50 collected htmls in your db, so the parser is stressed enough.

docker-compose up -d mongodb
yarn pm2 start platforms/yttrex/backend/ecosystem.dev.config.js # the yt:parser:env is started with max memory of 2GB
yarn yt:backend reset-htmls-processed # reset htmls.processed to `false` and update the `savingTime`
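For readers unfamiliar with the reset step, here is a minimal in-memory sketch of the effect described in the comment above. It is an assumption-laden illustration, not the actual command: the real command works against MongoDB, the `HtmlEntry` shape and `resetProcessed` name are hypothetical, and setting `savingTime` to the current time is assumed.

```typescript
// Hypothetical shape of a stored html entry; only the fields mentioned above.
interface HtmlEntry {
  id: string;
  processed: boolean;
  savingTime: Date;
}

// Return copies of the entries marked as unprocessed, with savingTime
// bumped to `now` (assumed behavior) so the parser picks them up again.
function resetProcessed(entries: HtmlEntry[], now: Date): HtmlEntry[] {
  return entries.map((entry) => ({
    ...entry,
    processed: false,
    savingTime: now,
  }));
}
```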

You can also debug this with VS Code by starting the parserv:debug process and then attaching the debugger from the proper tab.