Fix regex's inconsistent word breaking around apostrophes

Relaxing the dependency on regex had an unintended consequence in 2.3.1: wordfreq could no longer get the frequency of French phrases such as "l'écran", because their tokenization changed depending on which version of regex was installed.

Fix this with a more complex tokenization rule that should handle apostrophes the same way across the various versions of regex.
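
To illustrate the idea, here is a minimal sketch of an apostrophe rule written so that it does not rely on the regex engine's built-in word-break behavior. It is not the expression from tokens.py: it uses the standard library `re` module instead of regex, and the names `TOKEN_RE` and `tokenize_sketch`, as well as the vowel list, are assumptions made for this example. It only handles the straight apostrophe (').

```python
import re

# Hypothetical pattern for illustration only; the real rule in tokens.py
# is more involved and is written for the `regex` module.
TOKEN_RE = re.compile(
    r"""
    \b\w{1,2}'(?=[aeiouyhàâäéèêëîïôöùûü])  # short elided prefix, e.g. "l'" or "qu'"
    |
    \w+(?:'\w+)*                           # any other word, keeping inner apostrophes
    """,
    re.VERBOSE,
)


def tokenize_sketch(text):
    """Split text into lowercase tokens, separating elided prefixes."""
    tokens = TOKEN_RE.findall(text.lower())
    # Drop the trailing apostrophe left on an elided prefix ("l'" -> "l").
    return [token.rstrip("'") for token in tokens]


print(tokenize_sketch("l'écran"))      # ['l', 'écran']
print(tokenize_sketch("aujourd'hui"))  # ["aujourd'hui"]
print(tokenize_sketch("can't"))        # ["can't"]
```

Because the split is spelled out in the pattern itself (one or two letters, an apostrophe, then a vowel or "h"), the result should not depend on how a given engine version decides where a word ends around an apostrophe.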

(I ran black so it could format these ugly expressions appropriately; there are some miscellaneous formatting changes to tokens.py that came along as a result.)

rspeer/wordfreq, pull request #77