Can I add <pad>?

Thanks for providing this easy-to-use tool.

I want to use bpemb in my project. To process sentences of different lengths in a batch, I have to append a padding token <pad> to the shorter sentences. But I find it impossible to do that because bpemb will tear down the `

The only workaround I can think of is to force <pad> in through a complicated process, which is a bit painful. Is there a more flexible way?

By the way, I noticed that the BPE token with id 0 seldom occurs in the processed ids. Can I use it as the padding token?
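
To make the question concrete, here is a minimal sketch of the kind of padding I have in mind. It pads at the id level after encoding, instead of inserting a <pad> string into the text. It assumes the sentences have already been converted to lists of ids. `pad_batch`, the example ids, and `vocab_size` are made up for illustration and are not part of bpemb.

```python
from typing import List, Tuple


def pad_batch(batch_ids: List[List[int]], pad_id: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every id sequence to the length of the longest one.

    Returns the padded sequences and a mask with 1 for real tokens
    and 0 for padding positions.
    """
    max_len = max((len(ids) for ids in batch_ids), default=0)
    padded, mask = [], []
    for ids in batch_ids:
        n_pad = max_len - len(ids)
        padded.append(ids + [pad_id] * n_pad)
        mask.append([1] * len(ids) + [0] * n_pad)
    return padded, mask


# Hypothetical id sequences standing in for already-encoded sentences.
batch = [[12, 7, 431], [5], [88, 3]]

# Assumed vocabulary size, for illustration only. Using it as the pad id
# avoids colliding with a real token such as id 0, but the embedding
# matrix would then need one extra row for the padding vector.
vocab_size = 1000

padded, mask = pad_batch(batch, pad_id=vocab_size)
print(padded)
print(mask)
```

With a mask like this, the padded positions can be ignored downstream, so the choice of pad id matters less. Reusing id 0 would presumably also work this way, as long as the mask and not the id itself is what marks padding.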

bheinzerling / bpemb

Can I add <pad>? #52