Closed iWantToKeepAnon closed 4 years ago
The reason tidy was made default was along the same lines as why firefox automatically fixes broken web pages, I was encountering too many issues with broken websites that were easily solved with a quick tidy.
I'm not going to revert the change, however what I'll do is introduce several global switches (like the debug:true
) switch so you can change the default behavior
I'm not so concerned about the switch as I am that tidy failed and returned a zero length string. Perhaps a small size check and if tidy fails (without throwing an exception), then keep the original html string. (?)
I've added a check to catch if tidy removes more than half of the html. I still need to fix the defaults before I merge to master
Perfect, thanks for a quick turn around! : ))
Prerequisites
Expected Behavior
Rule on gocomics returns og:image in an
<img/>
element.Current Behavior
Empty output. Seems like tidy drops the whole pile of HTML on the floor (i.e. it returns the empty string) but doesn't trigger an exception.
Steps to Reproduce
Make sure tidy is installed in php and function 'tidy_parse_string' is available.
Run url:
https://www.gocomics.com/calvinandhobbes/2020/01/06
Against rule:
Failure Information (for bugs)
I recently did a fresh pull. This seems to be the culprit : https://github.com/feediron/ttrss_plugin-feediron/commit/7e538de5a7ccffda986f8af3fa33a29b7cecbda5#diff-e9fa0d13b445ed0246ee0003db4374d6
Why is a default behavior change a good idea? I added
"tidy-source": false
to all my rules and the above works again. I have a few rules and adding that to each was a bit of a pain, but I am functional again.Seems like a bad idea to me!
Paste Configuration here (see above)