infogulch closed this 7 months ago
Apart from a few tests affected by the issue mentioned in the review above, I think this should work fine.
I'd like to get some input on the review above before I convert this from a draft.
@infogulch I think the fallback image sources in the translator function you added look clean and make sense to me, including the HTML parsing code. I had no clue that many feeds stash their images in there, lol.
@infogulch update looks good to me.
I might create a separate issue to think about what to do with naked HTML markup within tags.
Thank you for your contribution @infogulch !
Now I just need to tackle #210, and hopefully re-enable gating of PRs on passing tests.
I'd like to comment that fetching the first <img> inside the body isn't such a great idea.
Take for example the feed from slashdot: https://rss.slashdot.org/Slashdot/slashdotMain
The first image in the body will be https://a.fsdn.com/sd/twitter_icon_large.png, which is 56x20 pixels. That is clearly unsuitable as a thumbnail for an article.
Perhaps it would be better to place the first body image as an extension? Then clients can choose whether they want to consider it or not.
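To make the suggestion concrete, here is a rough sketch (in Python, purely for illustration; it is not the library's code) of handing clients every body image and letting them decide. The names `ImageCollector`, `body_images`, `pick_thumbnail`, and the `min_side` threshold are all hypothetical, and it assumes the markup declares `width`/`height` attributes, which a real feed may not do.

```python
from html.parser import HTMLParser


def _to_int(value):
    """Parse a width/height attribute; return None if missing or not a plain integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


class ImageCollector(HTMLParser):
    """Collects every <img> that has a src, in document order."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if attributes.get("src"):
            self.images.append({
                "src": attributes["src"],
                "width": _to_int(attributes.get("width")),
                "height": _to_int(attributes.get("height")),
            })


def body_images(html_text):
    """Return all candidate images instead of committing to the first one."""
    collector = ImageCollector()
    collector.feed(html_text or "")
    collector.close()
    return collector.images


def pick_thumbnail(images, min_side=64):
    """Client-side policy (hypothetical): skip images whose declared size is tiny."""
    for image in images:
        width, height = image["width"], image["height"]
        if width is not None and width < min_side:
            continue
        if height is not None and height < min_side:
            continue
        return image["src"]
    return None
```

With a policy like this, a 56x20 share icon would be skipped only if its size is declared in the markup; otherwise the client would need to fetch the image to learn its dimensions.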
Additional locations from which image extraction is attempted:
<img> in content or description

Fixes #133
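For readers unfamiliar with the fallback described above, this is a minimal sketch of the idea in Python: look for the first `<img>` in the content, then in the description. It is not the PR's implementation; `FirstImageParser`, `first_image`, and `fallback_image` are hypothetical names, and the order (content before description) is an assumption.

```python
from html.parser import HTMLParser


class FirstImageParser(HTMLParser):
    """Remembers the src of the first <img> encountered."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img" and self.src is None:
            src = dict(attrs).get("src")
            if src:
                self.src = src


def first_image(html_text):
    """Return the src of the first <img> in html_text, or None."""
    parser = FirstImageParser()
    parser.feed(html_text or "")
    parser.close()
    return parser.src


def fallback_image(content, description):
    """Try the content first, then the description (assumed order)."""
    for html_text in (content, description):
        src = first_image(html_text)
        if src:
            return src
    return None
```

As the discussion notes, taking the first image blindly can pick up small icons, so a caller may want to treat this result as a hint rather than a definitive thumbnail.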