Open connorjoleary opened 3 years ago
Looks like that part of the page doesn't have a paragraph tag
Oh man, Idk how to properly grab text in this situation.
```
texts = soup.findAll()
static = list(filter(self.tag_visible, texts))
```
returns duplicates of each section
`soup.get_text()`
returns the text only once, but only the text, not the hrefs with it.
`[d.text for d in soup.findAll() if not d.find() and d.text]`
seems to be the closest, but doesn't return the line which started this whole ticket (that line has children or descendants, idk which term is correct).
Another example of this bug: https://www.reddit.com/r/todayilearned/comments/osthf5/til_it_is_common_for_truck_drivers_in_india_to/
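One possible direction, as a rough sketch rather than what deepcite actually does: walk the text nodes instead of the tags, so each piece of text shows up exactly once and still carries the href of the link it sits in, even when the text has no paragraph tag around it. The sketch below swaps in the standard library's `html.parser` in place of BeautifulSoup so it runs without dependencies. The names `TextWithLinks`, `extract_text_with_links`, `SKIPPED`, and the sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser

# Elements whose contents are not visible page text (illustrative list).
SKIPPED = {"script", "style", "head", "title"}


class TextWithLinks(HTMLParser):
    """Collect each visible text node once, paired with the enclosing link's href."""

    def __init__(self):
        super().__init__()
        self.chunks = []       # list of (text, href or None)
        self._hrefs = []       # hrefs of the <a> tags currently open
        self._skip_depth = 0   # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            self._hrefs.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag in SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "a" and self._hrefs:
            self._hrefs.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            href = self._hrefs[-1] if self._hrefs else None
            self.chunks.append((text, href))


def extract_text_with_links(html):
    """Return [(text, href_or_None), ...] in document order, without duplicates."""
    parser = TextWithLinks()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical page fragment: loose text (no <p>) that contains a link.
sample = (
    '<div>Loose text with a <a href="https://example.com/src">source link</a> inside.'
    "<p>A paragraph.</p><script>var x = 1;</script></div>"
)
chunks = extract_text_with_links(sample)
for text, href in chunks:
    print(repr(text), href)
```

The duplicates from `soup.findAll()` presumably come from every ancestor tag repeating the text of its descendants, which is why iterating text nodes avoids them. In BeautifulSoup the closest equivalent is probably `soup.find_all(string=True)` combined with a look at each string's parent tags, though that is untested here.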
Describe the bug: The true source is not given as an option.
To Reproduce: Steps to reproduce the behavior: https://www.reddit.com/r/todayilearned/comments/n9evzh/til_theres_roughly_100_firefighter_arsonists/ (full quote, with that website as the link)
Expected behavior: There is literally a quote in the source, how did deepcite miss this?