fterh / rsg-retrivr

This Reddit bot is all about the "too lazy; didn't click" life
https://reddit.com/u/rsg-retrivr

Added Techcrunch selector (experimental) #2

Closed jglim closed 6 years ago

jglim commented 6 years ago

Works okay for articles without bylines (https://techcrunch.com/2017/10/19/two-google-alums-just-raised-60m-to-rethink-documents/).

A little bit uglier with a byline (https://techcrunch.com/2017/10/22/defensible-strategies-for-food-tech-entrepreneurs-facing-the-amazon-juggernaut/).

It would be ideal to exclude div.byline in the body selector, but the current implementation doesn't seem to be able to do that yet.

fterh commented 6 years ago

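Hey, I actually think the byline thing wouldn't be an issue if you changed the selector to div.article-entry > p. The > only matches paragraphs that are direct children of div.article-entry, so anything nested inside the byline div is excluded automatically.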
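For illustration, here's a rough stdlib-only sketch of what a child selector like div.article-entry > p picks up. The bot's actual selector engine isn't shown in this thread, so this is not its implementation: SAMPLE_HTML is made-up markup (not copied from TechCrunch), and DirectChildParagraphs and extract_body are hypothetical names. It also assumes well-formed markup.

```python
from html.parser import HTMLParser

# Hypothetical markup, loosely modelled on the structure discussed in this
# thread; it is NOT copied from a real TechCrunch page.
SAMPLE_HTML = """
<div class="article-entry">
  <div class="byline"><p>By A. Writer</p></div>
  <p>First paragraph.</p>
  <p>Second <em>paragraph</em>.</p>
</div>
"""

# Tags that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class DirectChildParagraphs(HTMLParser):
    """Collect the text of <p> elements matching `div.article-entry > p`."""

    def __init__(self):
        super().__init__()
        self.stack = []       # (tag, classes) for every currently open element
        self.paragraphs = []  # text of each matched paragraph, in order
        self._buffer = None   # text chunks of the <p> being captured, if any
        self._depth = None    # stack depth at which the capture started

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        parent = self.stack[-1] if self.stack else None
        self.stack.append((tag, classes))
        # Child combinator: the *immediate* parent must be div.article-entry.
        is_match = (
            tag == "p"
            and parent is not None
            and parent[0] == "div"
            and "article-entry" in parent[1]
        )
        if is_match and self._buffer is None:
            self._buffer = []
            self._depth = len(self.stack)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        # Close the capture only when the matched <p> itself ends.
        if self._buffer is not None and len(self.stack) == self._depth and tag == "p":
            self.paragraphs.append("".join(self._buffer).strip())
            self._buffer = None
            self._depth = None
        # Naive pop: assumes every opened tag is closed in order.
        self.stack.pop()

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def extract_body(html):
    """Return the text of every <p> that is a direct child of div.article-entry."""
    parser = DirectChildParagraphs()
    parser.feed(html)
    return parser.paragraphs
```

Calling extract_body(SAMPLE_HTML) on the sample keeps the two body paragraphs and skips the one inside div.byline, because that paragraph's immediate parent is the byline div rather than div.article-entry.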

That said, I'm super busy this week, so I won't have time to merge your code. Maybe next week.

fterh commented 6 years ago

Hey, starting from v3 I'm no longer planning on using content selectors to manually extract information, so I'm closing this pull request. Appreciate it though! If you have any further suggestions or future pull requests, I'll be happy to consider them. Cheers.