Improve library usability

When using subword-nmt as a Python library rather than a script, calling BPE.segment() might result in unnecessary string operations.

Consider the following situation, where the user already has a list of tokens and needs a list of segments:

sentence = ' '.join(tokens)
segments = bpe.segment(sentence)
segments = segments.split(' ')

... and inside BPE.segment(), the reverse of these operations happens at the boundaries, i.e. the sentence is first split on whitespace, and the list of segments is joined back into a string before returning.

This pull request adds a new method, BPE.segment_tokens(), which accepts an iterable of tokens and returns a list of segments, while leaving the current API unchanged. Callers who already have a token list can use it to skip the superfluous join and split in the scenario described above.
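To illustrate the relationship between the two methods, here is a minimal sketch using a toy stand-in class. ToySegmenter, its max_len parameter and its fixed-size splitting rule are hypothetical and are not the subword-nmt implementation; only the method names segment() and segment_tokens() and their input/output types follow the description above.

```python
class ToySegmenter:
    """Hypothetical stand-in for BPE; the splitting rule is NOT real BPE."""

    def __init__(self, max_len=3, separator="@@"):
        self.max_len = max_len      # assumed toy parameter
        self.separator = separator  # marker appended to non-final pieces

    def _split_word(self, word):
        # Toy rule: cut the word into fixed-size chunks and mark
        # every chunk except the last one with the separator.
        chunks = [word[i:i + self.max_len]
                  for i in range(0, len(word), self.max_len)]
        return [c + self.separator for c in chunks[:-1]] + chunks[-1:]

    def segment_tokens(self, tokens):
        # Accepts an iterable of tokens, returns a list of segments.
        segments = []
        for token in tokens:
            segments.extend(self._split_word(token))
        return segments

    def segment(self, sentence):
        # String API kept as a thin wrapper: split, segment, join.
        return " ".join(self.segment_tokens(sentence.split(" ")))


bpe = ToySegmenter()
tokens = ["lower", "new"]

# Round trip through strings, as in the situation described above.
via_string = bpe.segment(" ".join(tokens)).split(" ")

# Direct call on the token list, with no join/split on the caller's side.
direct = bpe.segment_tokens(tokens)
```

In this sketch both paths yield the same list of segments; the direct call simply avoids building and re-splitting the intermediate string.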

rsennrich/subword-nmt, pull request #52