Open EruditeLying opened 5 months ago
BTW, one more thing to note: it looks like none of the PRs you submitted that day got a workflow run (with the usual checks), so rebasing on top of current master might be useful too.
I don't think this is the right approach. The Substack translator is very basic. On that test case, Embedded Metadata already gets the author name, abstract, and date. The item type is wrong (`webpage` instead of `blogPost`), but the only field it's missing is `websiteType` ("Substack newsletter", which isn't necessarily correct for custom-domain Substack blogs). It would be much better to detect Substack in EM and set the item type correctly than to turn this into a generic translator.
Remove the substack.com domain from the target regex to support Substacks with a custom domain
In `detectWeb`, first check the URL against the old target regex, then look for one of the Substack footer buttons in the page DOM. If neither matches, return `false`.
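
For reviewers, the two-step check described above could look roughly like the sketch below. This is not the code from the PR: the `OLD_TARGET` pattern and the `FOOTER_BUTTON_SELECTOR` value are hypothetical placeholders, and the real regex and footer markup may differ.

```javascript
// Hypothetical stand-in for the translator's old target regex
// (the actual pattern in the translator header may differ).
const OLD_TARGET = /^https?:\/\/[^/]+\.substack\.com\/p\/[^/?#]+/;

// Hypothetical selector for one of the Substack footer buttons;
// this is an assumption, not the selector used in the PR.
const FOOTER_BUTTON_SELECTOR = 'a[href*="substack.com/signup"]';

function detectWeb(doc, url) {
	// 1. Pages hosted on substack.com still match the old pattern.
	if (OLD_TARGET.test(url)) {
		return "blogPost";
	}
	// 2. Custom domains: look for a Substack footer button in the DOM.
	if (doc.querySelector(FOOTER_BUTTON_SELECTOR)) {
		return "blogPost";
	}
	// 3. Neither check matched, so this translator should not run.
	return false;
}
```

Checking the URL first keeps the common substack.com case cheap, and the DOM lookup only runs for pages that the emptied target regex would otherwise let through.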