Hk669 / bpetokenizer

(py package) train your own tokenizer based on BPE algorithm for the LLMs (supports the regex pattern and special tokens)
https://pypi.org/project/bpetokenizer/
2 stars 1 forks source link

make the `from_pretrained` method available to load the tokenizers from pretrained. #5

Closed Hk669 closed 4 weeks ago

Hk669 commented 4 weeks ago

it should look something like:


@classmethod
def from_pretrained(cls, 
                tokenizer_name: str, 
                pretrained_dir = 'pretrained',
                verbose=False):
Hk669 commented 4 weeks ago

addressed in the PR #6