cloudfour / lighthouse-parade

A Node.js command line tool that crawls a domain and gathers lighthouse performance data for every page.
MIT License
357 stars, 14 forks

Output referrer when there is a crawl error #127

Closed: exortech closed this 1 year ago

exortech commented 1 year ago

Resolves #126

changeset-bot[bot] commented 1 year ago

🦋 Changeset detected

Latest commit: 5aee276f17d1c3bfbb9dd02ad9e63e6fd80abde4

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

| Name              | Type  |
| ----------------- | ----- |
| lighthouse-parade | Patch |

calebeby commented 1 year ago

Hey @exortech, I am planning to release a new major version pretty soon (the next few days), on the next branch. Any chance you could reimplement this but targeting the next branch?

calebeby commented 1 year ago

Thanks for taking the time to send a PR by the way!

exortech commented 1 year ago

Sure. No problem.

I have another change that I'd like to propose, which is to expose crawler.parseScriptTags as a configuration parameter. The built-in parser in simplecrawler is pretty basic and generally does a poor job of pulling URIs out of script tags. This creates a lot of false positives, especially when I'm also using simplecrawler to detect broken links. Changing the code to stop parsing script tags altogether would be simplest, but it would break backwards compatibility. So the intention is to make script parsing configurable, leaving it enabled by default.
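A minimal sketch of how such an option could be threaded through, assuming a crawler object that exposes `parseScriptTags` as a plain boolean property (as simplecrawler does, defaulting to `true`). The `CrawlOptions` shape and the `applyCrawlOptions` helper are hypothetical names used for illustration only; they are not part of lighthouse-parade.

```typescript
// Minimal stand-in for the part of a simplecrawler instance used here.
// simplecrawler exposes `parseScriptTags` as a plain boolean property.
interface CrawlerLike {
  parseScriptTags: boolean;
}

// Hypothetical option shape; lighthouse-parade does not define this today.
interface CrawlOptions {
  /** When false, URLs inside <script> tags are not discovered. Defaults to true. */
  parseScriptTags?: boolean;
}

// Copies the option onto the crawler. When the caller does not specify
// anything, the value is set to true, which matches the existing default,
// so current behavior is unchanged.
function applyCrawlOptions<T extends CrawlerLike>(
  crawler: T,
  options: CrawlOptions = {}
): T {
  crawler.parseScriptTags = options.parseScriptTags ?? true;
  return crawler;
}

// Usage with a plain object standing in for a real crawler instance.
const crawler = applyCrawlOptions(
  { parseScriptTags: true },
  { parseScriptTags: false }
);
console.log(crawler.parseScriptTags); // false
```

Defaulting to `true` when the option is omitted is what keeps the change backwards compatible: only callers who opt out see different crawl results.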

What do you think? Does that align with functionality that you would want to support for lighthouse-parade? Cheers, Owen.

calebeby commented 1 year ago

To be honest, I am not a big fan of the simplecrawler library (and the library is now deprecated as well). I would definitely be open to using a different library that may avoid some of simplecrawler's issues, and I'd also be open to adding parameters to configure the behavior of that new crawler. But I think I will release the next version before I make that change, in a future major version.

exortech commented 1 year ago

Makes sense. I noticed that you have replacing simplecrawler on your task list for the next version in #117. So I guess it makes sense to hold off on this change until an alternate crawler is in place.

Closing this PR to submit a new PR for the next branch.