Modify `BertTokenizer`

While using `BertTokenizer`, I found two behaviors that I think need to be changed.

After the text is split on whitespace (` `), every component after the first loses its leading ` `.

Example input: `Hello, I like dog.`

Actual tokens: `["Hello", "I", "like", "dog"]`

Expected tokens in my case: `["Hello", " I", " like", " dog"]`
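To make the expected behavior concrete, here is a minimal sketch in Python. It is not the Swift code from this PR. The function name `split_keeping_leading_space` is hypothetical, and dropping punctuation is an assumption made only so that the output mirrors the example above.

```python
import re
from typing import List


def split_keeping_leading_space(text: str) -> List[str]:
    """Split text into words, keeping one leading space on each word after the first.

    Assumption: punctuation is discarded, only to mirror the example in this PR.
    """
    # Each match is an optional single space followed by a run of word characters.
    return re.findall(r" ?\w+", text)


tokens = split_keeping_leading_space("Hello, I like dog.")
print(tokens)
```

Because each token after the first keeps its leading space, the positions of the word boundaries remain recoverable from the token list.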

All input is converted to lowercase, but I think this may not always be correct, since it depends on the contents of `vocab.json`.
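The dependence on the vocabulary can be illustrated with a small Python sketch. The function `lookup_ids`, the `do_lower_case` flag as used here, and the tiny vocabularies are all hypothetical stand-ins, not the Swift tokenizer's actual API or a real `vocab.json`.

```python
from typing import Dict, List


def lookup_ids(tokens: List[str], vocab: Dict[str, int],
               do_lower_case: bool, unk: str = "[UNK]") -> List[int]:
    """Map tokens to ids, lowercasing first only when do_lower_case is set."""
    ids = []
    for token in tokens:
        key = token.lower() if do_lower_case else token
        # Fall back to the unknown-token id when the key is not in the vocabulary.
        ids.append(vocab.get(key, vocab[unk]))
    return ids


# Hypothetical cased vocabulary: "Hello" and "hello" are distinct entries.
cased_vocab = {"[UNK]": 0, "Hello": 1, "hello": 2}
# Hypothetical uncased vocabulary: only the lowercase form exists.
uncased_vocab = {"[UNK]": 0, "hello": 1}

print(lookup_ids(["Hello"], cased_vocab, do_lower_case=False))
print(lookup_ids(["Hello"], uncased_vocab, do_lower_case=False))
print(lookup_ids(["Hello"], uncased_vocab, do_lower_case=True))
```

With a cased vocabulary, unconditional lowercasing maps `Hello` to the wrong entry; with an uncased vocabulary, skipping lowercasing falls back to the unknown token. That is why the choice should follow the vocabulary rather than be hard-coded.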

I fixed both of these issues. Please correct me or close this PR if I am wrong.

huggingface / swift-coreml-transformers