inhumantsar / slurp

Slurps webpages and saves them as clean, uncluttered Markdown. Think Pocket, but better.
https://inhumantsar.github.io/slurp/
MIT License
127 stars 2 forks

Implement hooks for page processing #37

Open inhumantsar opened 1 month ago

inhumantsar commented 1 month ago

Relying solely on upstream parsers will never yield perfect results, so it would make sense to implement hooks that apply fixes for common simplification and conversion errors.

Hooks should be available at three points in the slurping process: before simplification, before Markdown conversion, and after conversion. Each hook will need to be either universal or site-specific. They should not be user-configurable to start. Relevant checks and processing should be migrated to use these hooks, but each migration should be addressed as a separate issue.
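To make the proposal concrete, here is a rough sketch of what such a hook registry might look like. Every name in it (`HookStage`, `Hook`, `HookRegistry`, the stage labels, and the two sample hooks) is hypothetical and not taken from the slurp codebase; it only illustrates the three hook points and the universal vs. site-specific split described above.

```typescript
// Hypothetical sketch only: none of these names exist in slurp today.

// The three points in the slurping process where hooks can run.
type HookStage = "preSimplify" | "preConvert" | "postConvert";

interface Hook {
  name: string;
  stage: HookStage;
  // Leave `sites` undefined for a universal hook;
  // otherwise list the hostnames the hook applies to.
  sites?: string[];
  // Receives the page content (HTML before conversion, Markdown after)
  // and returns the corrected content.
  apply: (content: string) => string;
}

class HookRegistry {
  private hooks: Hook[] = [];

  register(hook: Hook): void {
    this.hooks.push(hook);
  }

  // Runs every matching hook for the given stage, in registration order.
  run(stage: HookStage, content: string, hostname: string): string {
    let result = content;
    for (const hook of this.hooks) {
      if (hook.stage !== stage) continue;
      const sites = hook.sites;
      const matchesSite =
        sites === undefined || sites.some((site) => site === hostname);
      if (matchesSite) {
        result = hook.apply(result);
      }
    }
    return result;
  }
}

// Hooks are registered in code rather than through user settings.
const registry = new HookRegistry();

// Universal hook: collapse runs of blank lines in the converted Markdown.
registry.register({
  name: "collapse-blank-lines",
  stage: "postConvert",
  apply: (markdown) => markdown.replace(/\n{3,}/g, "\n\n"),
});

// Site-specific hook: strip a banner element before simplification.
registry.register({
  name: "strip-banner",
  stage: "preSimplify",
  sites: ["example.com"],
  apply: (html) => html.replace(/<div class="banner">[\s\S]*?<\/div>/g, ""),
});
```

In this sketch the slurping pipeline would call `registry.run(...)` once at each of the three points, passing the hostname of the page being processed. Keeping registration in code instead of settings matches the "not user-configurable to start" constraint, and each existing check could later be moved into its own hook without changing the pipeline.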

See also #34, #15, #14.