The profanity in the "what's your favourite font?" question has two implications:
You could actually do some analysis of a binary indicator of profanity vs. no profanity. How does that relate to other things? Age, candy preference, favourite font, timestamp, ...
Before we can make this a data package for the world, that's got to get cleaned up. No way I would submit that to CRAN.
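Here's a minimal sketch of both implications in base R: flag the profane responses, then mask them. Everything named here is made up for illustration. The `survey` data frame, its `age` and `font` columns, and the tiny `naughty` vector are stand-ins, not the real survey data or a real word list.

```r
# Hypothetical stand-in for the survey responses; column names are assumptions.
survey <- data.frame(
  age  = c(23, 41, 35, 19),
  font = c("Helvetica", "darn Comic Sans", "Who the heck cares", "Garamond"),
  stringsAsFactors = FALSE
)

# Placeholder word list (assumption); a real one would be read from a published list.
naughty <- c("darn", "heck")

# Build one pattern that matches any listed word as a whole word.
pattern <- paste0("\\b(", paste(naughty, collapse = "|"), ")\\b")

# Implication 1: a binary profanity indicator to relate to other variables.
survey$profane <- grepl(pattern, survey$font, ignore.case = TRUE, perl = TRUE)
age_by_profanity <- aggregate(age ~ profane, data = survey, FUN = mean)

# Implication 2: a cleaned-up copy of the responses for a shareable package.
survey$font_clean <- gsub(pattern, "****", survey$font,
                          ignore.case = TRUE, perl = TRUE)
```

Whole-word matching avoids flagging innocent words that merely contain a listed word, but it will miss creative spellings, so expect to eyeball the results.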
From helpful people on Twitter (see the replies to this), I've learned of some official naughty word lists: