Closed jecarr closed 3 years ago
Hello @jecarr and thank you for the bug report. TRAM has moved to https://github.com/center-for-threat-informed-defense/tram, and this issue is no longer present in that repository, so I am closing it. Thank you!
Please Describe The Problem To Be Solved
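Rarely, a page may literally contain the text 'src='. Example url. This breaks TRAM's analysis because, during map_all_html(), the 'src=' text is treated as an image element, which then breaks
soup.img['src']
If the element contains no actual <img> tag, soup.img is None and the lookup raises a TypeError.
Proposed Change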
I have a fix here - arachne-threat-intel/thread@9dd98bd - but haven't created a PR due to maintaining #61. The approach is to use BeautifulSoup's findAll() method to look up images. Because findAll() returns an empty list when no image is found, the code can continue instead of failing.
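
For illustration, here is a minimal sketch of the idea. It is not the code from the linked commit. The helper name `image_sources` and the sample strings are hypothetical, and it assumes the text is parsed with the standard-library `html.parser` backend. It relies on `find_all()` (the current name for `findAll()`) returning an empty list when there is no match.

```python
from bs4 import BeautifulSoup


def image_sources(html_fragment):
    """Return the src of every <img> tag in the fragment (hypothetical helper)."""
    soup = BeautifulSoup(html_fragment, "html.parser")
    # find_all() yields an empty list when no <img> tag exists, whereas
    # soup.img would be None and soup.img['src'] would raise a TypeError.
    return [img["src"] for img in soup.find_all("img") if img.has_attr("src")]


# A paragraph that merely mentions 'src=' in its text, with no image tag.
text_only = "<p>Set the attribute with src='logo.png' in your markup.</p>"
# A paragraph that really contains an image.
with_image = '<p>Logo: <img src="logo.png" alt="logo"></p>'

print(image_sources(text_only))
print(image_sources(with_image))
```

With this shape, a page that only mentions 'src=' in its text produces no image sources and processing carries on, while real images are still picked up.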