Hi!
My group is struggling to keep multi-word terms like New Zealand, Donald Trump and alt-right together as single tokens when tokenizing. We Googled it and found a solution, but it seems far too elaborate for what must be a very common text analysis problem. Are there any smart ways to handle this?
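To make the desired behaviour concrete, here is a minimal sketch in plain Python. It assumes a small, hand-maintained list of terms to protect. The names `PHRASES` and `tokenize_keep_phrases` are hypothetical and only for illustration; this is not the solution found via Google, and it will likely not scale to large phrase lists.

```python
import re

# Hypothetical list of multi-word terms to keep together (an assumption for illustration).
PHRASES = ["New Zealand", "Donald Trump", "alt-right"]


def tokenize_keep_phrases(text, phrases=PHRASES):
    """Split text into word tokens, keeping each listed phrase as a single token."""
    if not phrases:
        # Nothing to protect: fall back to plain word tokens.
        return re.findall(r"\w+", text)
    # Try longer phrases first so a longer phrase wins over a shorter one it contains.
    ordered = sorted(phrases, key=len, reverse=True)
    phrase_alt = "|".join(re.escape(p) for p in ordered)
    # Match a listed phrase on word boundaries, otherwise a plain run of word characters.
    pattern = re.compile(rf"\b(?:{phrase_alt})\b|\w+", flags=re.IGNORECASE)
    return pattern.findall(text)


tokens = tokenize_keep_phrases(
    "Donald Trump called the alt-right rally in New Zealand a surprise."
)
print(tokens)
```

Punctuation is dropped and matching is case-insensitive in this sketch; both are design choices that may not fit every corpus.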