juanshishido / okcupid

Analyzing online self-presentation
MIT License

urls #6

Open juanshishido opened 8 years ago

juanshishido commented 8 years ago

I remember seeing a comment about URLs still being in the text. I used the following on another project with pretty good success: `.apply(lambda x: re.sub(r'(http\S*|www\S*)', '', x))`. There are probably ways to improve this. It could be useful if we want to filter those out.
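For reference, here is a minimal standalone sketch of the substitution above. The helper name `strip_urls` and the sample string are made up for illustration; only the regular expression comes from the snippet. With a pandas Series of essay text, the same function could presumably be passed to `.apply`.

```python
import re

# Same pattern as in the snippet above, written as a raw string and compiled once.
URL_PATTERN = re.compile(r'(http\S*|www\S*)')

def strip_urls(text):
    """Hypothetical helper: drop every run of non-whitespace that starts at 'http' or 'www'."""
    return URL_PATTERN.sub('', text)

# Illustrative input, not taken from the project's data.
sample = "my blog is at http://example.com/me and www.example.org too"
cleaned = strip_urls(sample)
print(repr(cleaned))
```

Note that the pattern is not anchored to word boundaries, so it also removes the tail of any token that merely contains "http" or "www", and it leaves the surrounding whitespace in place.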

jnaras commented 8 years ago

Sounds good! Yes, some URLs like "youtube.com" come up pretty frequently. Could be useful, though.
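Bare domains such as "youtube.com" have no `http` or `www` prefix, so the pattern above would leave them in. If we decide to filter those as well, one possible sketch is below. The suffix list (`com`, `org`, `net`) and the name `strip_bare_domains` are assumptions for illustration, not something already in the project, and the list would need tuning against the actual essays.

```python
import re

# Assumed, deliberately short list of domain suffixes; extend as needed.
BARE_DOMAIN = re.compile(r'\b[\w-]+(?:\.[\w-]+)*\.(?:com|org|net)\b\S*')

def strip_bare_domains(text):
    """Hypothetical helper: remove tokens that look like example.com, with any trailing path."""
    return BARE_DOMAIN.sub('', text)

# Illustrative input, not taken from the project's data.
print(repr(strip_bare_domains("i watch youtube.com a lot")))
```

If the domains turn out to be informative, the same pattern could be used with `findall` to count them instead of removing them.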