Closed: kkew3 closed this 4 months ago
Under transformers==4.41.2, constructing CharacterTokenizer raises NotImplementedError.
Command to reproduce the error (from command line), plus the error output:
$ python3 -c "from charactertokenizer import CharacterTokenizer; _ = CharacterTokenizer('abc', 1024)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/user/Documents/Projects/python3/proj/dariush-bahrami+character-tokenizer/charactertokenizer/core.py", line 44, in __init__
    super().__init__(
  File "/Users/user/Documents/Projects/python3/proj/dariush-bahrami+character-tokenizer/venv/lib/python3.9/site-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/Users/user/Documents/Projects/python3/proj/dariush-bahrami+character-tokenizer/venv/lib/python3.9/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "/Users/user/Documents/Projects/python3/proj/dariush-bahrami+character-tokenizer/venv/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1682, in get_vocab
    raise NotImplementedError()
NotImplementedError
The error occurs because CharacterTokenizer does not implement the get_vocab() method required by the superclass.
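To fix the error, I add the required get_vocab() method and adjust several statements in __init__() so that get_vocab() works. As the traceback shows, the superclass constructor calls get_vocab() through _add_tokens(), so the method has to be usable by the time super().__init__() runs.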
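The pattern behind the fix can be sketched without installing transformers. The example below is only an illustration under stated assumptions: BaseTokenizer is a hypothetical stand-in that mimics the call chain seen in the traceback (its constructor reads get_vocab()), and the class names, the _vocab_str_to_int attribute, and the "[UNK]" and "[PAD]" tokens are made up for the sketch. It is not the real transformers API and not the actual patch in this pull request.

```python
class BaseTokenizer:
    """Hypothetical stand-in for the transformers base class (not the real API)."""

    def __init__(self, special_tokens=()):
        # Like the _add_tokens() frame in the traceback, the base constructor
        # reads the current vocabulary while the object is still initializing.
        current_vocab = self.get_vocab().copy()
        self.added_tokens = [t for t in special_tokens if t not in current_vocab]

    def get_vocab(self):
        # Subclasses are expected to override this.
        raise NotImplementedError()


class BrokenCharTokenizer(BaseTokenizer):
    """Reproduces the failure: get_vocab() is never implemented."""

    def __init__(self, characters):
        # Fails here, before the vocabulary below is ever created.
        super().__init__(special_tokens=["[UNK]"])
        self._vocab_str_to_int = {ch: i for i, ch in enumerate(characters)}


class FixedCharTokenizer(BaseTokenizer):
    """Sketch of the fix: build the vocabulary first, then implement get_vocab()."""

    def __init__(self, characters):
        # Assumption: the vocabulary must exist before super().__init__(),
        # because the base constructor calls get_vocab() immediately.
        self._vocab_str_to_int = {"[UNK]": 0}
        for ch in characters:
            # len() is evaluated before insertion, so ids are assigned 1, 2, 3, ...
            self._vocab_str_to_int.setdefault(ch, len(self._vocab_str_to_int))
        super().__init__(special_tokens=["[UNK]", "[PAD]"])

    def get_vocab(self):
        # Return a copy so callers cannot mutate the internal mapping.
        return dict(self._vocab_str_to_int)


if __name__ == "__main__":
    try:
        BrokenCharTokenizer("abc")
    except NotImplementedError:
        print("broken variant raises NotImplementedError")

    tok = FixedCharTokenizer("abc")
    print(tok.get_vocab())
```

The ordering inside __init__() is the point of the sketch: if the vocabulary were assigned after super().__init__(), get_vocab() would fail with an AttributeError instead of a NotImplementedError.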
Thank you for fixing this issue