utterworks / fast-bert

Super easy library for BERT based NLP models
Apache License 2.0
1.85k stars, 342 forks

Error: cannot import name 'Unigram' from fast_bert.data_cls import BertDataBunch #268

Open zrajabi opened 3 years ago

zrajabi commented 3 years ago

I hit another odd error when importing fast_bert in Colab:

in ()
----> 1 from fast_bert.data_cls import BertDataBunch
      2 from fast_bert.learner_cls import BertLearner
      3 from fast_bert.metrics import accuracy
      4 import logging
      5 import torch

7 frames
/usr/local/lib/python3.6/dist-packages/transformers/convert_slow_tokenizer.py in ()
     22
     23 from tokenizers import Tokenizer, decoders, normalizers, pre_tokenizers, processors
---> 24 from tokenizers.models import BPE, Unigram, WordPiece
     25
     26 # from transformers.tokenization_openai import OpenAIGPTTokenizer

ImportError: cannot import name 'Unigram'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can manually install dependencies using either !pip or !apt. To view examples of installing some common dependencies, click the "Open Examples" button below.

asnota commented 3 years ago

As proposed in a similar issue, https://github.com/huggingface/transformers/issues/7806, upgrade tokenizers. In Colab, this worked for me:

%pip install --upgrade tokenizers

andrew-miao commented 3 years ago

I ran into this issue as well. Make sure your tokenizers version is >= 0.8.1rc1. One way to do that is to create a new notebook and run %pip install tokenizers before installing fast-bert.
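
To confirm whether the installed tokenizers is the cause, a quick check like the one below may help. It is only a sketch: the helper names (installed_version, release_tuple, has_unigram) are made up for illustration, the 0.8.1 threshold is taken from the comment above, and the version comparison looks at the numeric release only, so it ignores pre-release suffixes such as rc1. It also assumes Python 3.8+ for importlib.metadata; on the Python 3.6 runtime shown in the traceback, that import would need the importlib_metadata backport instead.

```python
import importlib
import re
from importlib import metadata  # stdlib on Python 3.8+ (assumption)

# Threshold from the thread: tokenizers >= 0.8.1rc1.
# Simplification: only the numeric release (0, 8, 1) is compared.
MIN_RELEASE = (0, 8, 1)


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version):
    """Extract the leading numeric release, e.g. '0.8.1rc1' -> (0, 8, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def has_unigram():
    """Return True if tokenizers.models exposes Unigram, the name the import fails on."""
    try:
        models = importlib.import_module("tokenizers.models")
    except ImportError:
        return False
    return hasattr(models, "Unigram")


if __name__ == "__main__":
    version = installed_version("tokenizers")
    if version is None:
        print("tokenizers is not installed")
    else:
        new_enough = release_tuple(version) >= MIN_RELEASE
        print(f"tokenizers {version}, meets threshold: {new_enough}")
    print(f"Unigram importable: {has_unigram()}")
```

If the last line reports False, upgrading tokenizers and then restarting the Colab runtime before importing fast_bert again is the likely next step, since a module that was already imported is not reloaded by the upgrade.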