Closed jwzimmer-zz closed 3 years ago
KeyError: when you tried to get the "href" attribute from the <a> tag, it didn't have one. A way of resolving it was mentioned in the other GH issue: check `if trope.has_attr('href')` when looping through the links.
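For reference, a minimal sketch of that guard. The HTML snippet and variable names are made up for illustration, and the actual loop in individualtropepage.py may look different; only `find_all` and `has_attr` are BeautifulSoup calls.

```python
from bs4 import BeautifulSoup

# Made-up snippet: the second <a> is a named anchor with no href,
# which is the kind of tag that triggers KeyError: 'href'.
html = (
    '<p><a href="/pmwiki/pmwiki.php/Main/Utopia">Utopia</a> '
    '<a name="top">anchor</a> '
    '<a href="/pmwiki/pmwiki.php/Main/Jerkass">Jerkass</a></p>'
)
soup = BeautifulSoup(html, "html.parser")

hrefs = []
for link in soup.find_all("a"):
    # Skip <a> tags without an href instead of indexing blindly.
    if link.has_attr("href"):
        hrefs.append(link["href"])

print(hrefs)
```

`link.get("href")` is another option: it returns `None` for a missing attribute rather than raising.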
Thanks, @nguyenhphilip, I will do this tmrw! 👍
So, there are some differences in the page structure I think we might want to consider... I don't know how many of the articles fit with either pattern...
Comparing Critic Breakdown and Eagleland:
On the Critic Breakdown trope page, https://tvtropes.org/pmwiki/pmwiki.php/Main/CriticBreakdown, the links of interest I think are just those within the main text of the article:
Which at first blush appears to match those returned by individualtropepage.py:
However, on Eagleland, https://tvtropes.org/pmwiki/pmwiki.php/Main/Eagleland, there is a section at the bottom of the page titled "Related tropes include:", followed by a list. It's not obvious to me whether we should include these... I think we might want to distinguish between the tropes organically linked to within the article text versus the ones people decided were related - I think those might reflect slightly different processes of categorization happening.
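One possible way to keep the two kinds of links apart, sketched with the standard library's `html.parser` rather than the BeautifulSoup code in individualtropepage.py. It assumes the related list always follows the literal text "Related tropes include", which may not hold on every page; the class name, the sample HTML, and the `/Main/SomeRelatedTrope` link are all made up.

```python
from html.parser import HTMLParser


class BodyLinkParser(HTMLParser):
    """Split hrefs into those before and after a marker phrase."""

    def __init__(self, marker="Related tropes include"):
        super().__init__()
        self.marker = marker
        self.body_links = []
        self.related_links = []
        self._in_related = False

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")  # None when the <a> has no href
        if href is None:
            return
        target = self.related_links if self._in_related else self.body_links
        target.append(href)

    def handle_data(self, data):
        # Everything after the marker text counts as the related list.
        if self.marker in data:
            self._in_related = True


# Made-up page fragment for illustration only.
sample = (
    '<p>See <a href="/Main/Utopia">Utopia</a> and '
    '<a href="/Main/DeepSouth">Deep South</a>.</p>'
    "<p>Related tropes include:</p>"
    '<ul><li><a href="/Main/SomeRelatedTrope">Some Related Trope</a></li></ul>'
)
parser = BodyLinkParser()
parser.feed(sample)
parser.close()
print(parser.body_links, parser.related_links)
```

Keeping both lists, instead of discarding the related ones, would leave the decision about including them open.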
So, assuming we don't want to include those, I get the dict: {'Eagleland': ['TruthInTelevision', 'TheBeautiful', 'Utopia', 'ThePromisedLand', 'TheFifties', 'GoodIsOldFashioned', 'TastesLikeDiabetes', 'IncorruptiblePurePureness', 'TheBoorish', 'WretchedHive', 'BloodKnight', 'PointyHairedBoss', 'Jerkass', 'Greed', 'ItsAllAboutMe', 'CondescendingCompassion', 'DeepSouth', 'TheWildWest', 'FatBastard', 'RedScare', 'TheFundamentalist', 'HeteronormativeCrusader', 'GunNut', 'MoralGuardians', 'FastFoodNation', 'GangsterLand', 'RichBitch', 'HollywoodCalifornia', 'TheSocialDarwinist', 'TheScrooge', 'CorruptCorporateExecutive', 'KillThePoor', 'ArsonMurderAndJaywalking', 'GlobalIgnorance', 'Mixed', 'TakeAThirdOption', 'BoisterousBruiser', 'IdiotHero', 'JerkWithAHeartOfGold', 'Trope', 'CulturalCringe', 'BoomerangBigot', 'CreatorProvincialism', 'MemeticMutation', 'WorldOfBadass', 'WretchedHive']}
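Note that 'WretchedHive' shows up twice in that list. If repeats are not meaningful for us, something like the sketch below could turn hrefs into trope names and drop duplicates while keeping first-seen order. The helper names are hypothetical, and the rule "only keep links whose second-to-last path segment is `Main`" is an assumption about what counts as a trope link.

```python
from urllib.parse import urlparse


def trope_name(href):
    """Return the last path segment of a Main/ link, or None for other links."""
    parts = urlparse(href).path.strip("/").split("/")
    if len(parts) >= 2 and parts[-2] == "Main":
        return parts[-1]
    return None


def unique_tropes(hrefs):
    """Map hrefs to trope names, dropping non-trope links and repeats."""
    names = (trope_name(h) for h in hrefs)
    # dict.fromkeys keeps first-seen order while removing duplicates.
    return list(dict.fromkeys(n for n in names if n is not None))


# Illustrative input; the Film/ link stands in for any non-trope page.
hrefs = [
    "https://tvtropes.org/pmwiki/pmwiki.php/Main/WretchedHive",
    "/pmwiki/pmwiki.php/Main/BloodKnight",
    "/pmwiki/pmwiki.php/Film/SomeFilm",
    "/pmwiki/pmwiki.php/Main/WretchedHive",
]
print(unique_tropes(hrefs))
```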
Thoughts on whether the lists of links should or should not be included, and did we include them on other pages?
Phil's list: every time any trope in the masterlist links to any other trope in the masterlist, anywhere on the page.
My files: which links are embedded in the text of the article, for each trope in the masterlist.
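To see how far the two versions diverge, a small comparison like this might help. It assumes both are dicts of the form `{trope: [linked tropes]}`, like the Eagleland dict above; the function name and the sample data are hypothetical.

```python
def extra_links(all_links, body_links):
    """For each trope, list links present in all_links but not in body_links."""
    extras = {}
    for trope, links in all_links.items():
        in_body = set(body_links.get(trope, []))
        diff = [name for name in links if name not in in_body]
        if diff:
            extras[trope] = diff
    return extras


# Made-up data: "phil" has every link on the page, "mine" only body links.
phil = {"Eagleland": ["Utopia", "DeepSouth", "SomeRelatedTrope"]}
mine = {"Eagleland": ["Utopia", "DeepSouth"]}
print(extra_links(phil, mine))
```

Tropes with a non-empty entry are the ones where page-level links (related lists, navigation, and so on) add something beyond the article text.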
Resolved, there is now a linked_tropes_dict matching eagleland.
https://github.com/jwzimmer/tv-tropes/commit/d04f03ebba11dc183e19d6b63db398cccc772eef
I think it was 23919 (or thereabouts): `
it.alltropes[23919]
Out[76]: 'Eagleland.html'
it.alltropes[23918]
Out[77]: 'WriterRevolt.html'
it.alltropes[23920]
Out[78]: 'ReallyRoyaltyReveal.html'
YEP! That's the one:
it.get_lists_tropes("trope_list/tropes/Eagleland.html")
Traceback (most recent call last):
  File "", line 1, in
    it.get_lists_tropes("trope_list/tropes/Eagleland.html")
  File "/Users/jzimmer1/Documents/GitHub/tv-tropes/individualtropepage.py", line 40, in get_lists_tropes
    href = link["href"]
  File "/Users/jzimmer1/opt/anaconda3/lib/python3.8/site-packages/bs4/element.py", line 1401, in __getitem__
    return self.attrs[key]
KeyError: 'href'
`
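Rather than guessing the index, a loop along these lines could report every file that fails, not just the first. This is only a sketch: `find_failures` and `fake_parse` are hypothetical, and in practice `parse` would be whatever wraps `get_lists_tropes` for a single file.

```python
def find_failures(filenames, parse, errors=(KeyError,)):
    """Run parse on each file name; return (index, name, error) for failures."""
    failures = []
    for index, name in enumerate(filenames):
        try:
            parse(name)
        except errors as exc:
            failures.append((index, name, repr(exc)))
    return failures


def fake_parse(name):
    # Stand-in for the real parser: fails on one file, like the report above.
    if name == "Eagleland.html":
        raise KeyError("href")
    return []


files = ["WriterRevolt.html", "Eagleland.html", "ReallyRoyaltyReveal.html"]
print(find_failures(files, fake_parse))
```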