Removing the option to load a page with puppeteer in the extract article nodejs script.
This was too slow and brittle to be useful, and it requires a Chromium browser on the machine to work.
Note for posterity: the slowness could be worked out by having a daemon Node.js process with puppeteer already initialized. The brittleness could be worked out by better capturing and displaying errors, and perhaps by retrying with exponential backoff instead of using a fixed timeout to load the page. But these would mean adding extra complexity for non-critical functionality. I'd rather make this a robust entry index/feed than a one-size-fits-all content reader.
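
For reference, should this ever be revisited, the retry-with-exponential-backoff idea mentioned above could look roughly like the sketch below. It is not code from this repository: `backoffDelay`, `withRetries`, and the default values (`retries`, `baseMs`, `maxMs`) are hypothetical names and numbers chosen for illustration only.

```javascript
// Hypothetical helper: delay before the retry that follows attempt number
// `attempt` (0-based). The delay doubles each time and is capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: runs `task` up to `retries + 1` times, waiting longer
// after each failure, and rethrows the last error if every attempt fails.
async function withRetries(task, { retries = 3, baseMs = 500, maxMs = 8000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await sleep(backoffDelay(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError;
}
```

With puppeteer, the page load would presumably be wrapped as something like `withRetries(() => page.goto(url, { timeout: 10000 }))`, so that each attempt keeps a short timeout and the waiting happens between attempts rather than inside one long fixed timeout.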