Hi! Could you paste the result of pip list in your environment here?
absl-py 0.9.0
apex 0.1
astor 0.8.1
astunparse 1.6.3
backcall 0.1.0
beautifulsoup4 4.9.1
blis 0.4.1
Bottleneck 1.3.2
cachetools 4.1.0
catalogue 1.0.0
certifi 2020.6.20
chardet 3.0.4
click 7.1.2
cycler 0.10.0
cymem 2.0.3
Cython 0.29.20
decorator 4.4.2
fastai 1.0.61
fastprogress 0.2.3
filelock 3.0.12
fire 0.3.1
future 0.18.2
gast 0.2.2
gluonnlp 0.9.1
google-auth 1.18.0
google-auth-oauthlib 0.4.1
google-pasta 0.2.0
graphviz 0.8.4
grpcio 1.29.0
h5py 2.10.0
idna 2.8
importlib-metadata 1.6.1
ipython 7.14.0
ipython-genutils 0.2.0
jedi 0.17.0
joblib 0.15.1
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
kiwisolver 1.2.0
kobert-transformers 0.4.1
kogpt2 0.1.1
kss 1.3.1
Markdown 3.2.2
matplotlib 3.2.2
mecab-python3 1.0.0
murmurhash 1.0.2
mxnet 1.6.0
natto 0.1.7
numexpr 2.7.1
numpy 1.19.0
nvidia-ml-py3 7.352.0
oauthlib 3.1.0
opt-einsum 3.2.1
packaging 20.4
pandas 1.0.5
parso 0.7.0
pdf2image 1.9.0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 6.2.0
pip 20.1.1
plac 1.1.3
preshed 3.0.2
prompt-toolkit 3.0.5
protobuf 3.12.2
psutil 5.7.0
ptyprocess 0.6.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
Pygments 2.6.1
pyparsing 2.4.7
pytesseract 0.2.7
python-dateutil 2.8.1
pytz 2020.1
PyYAML 5.3.1
regex 2017.4.5
requests 2.21.0
requests-oauthlib 1.3.0
rsa 4.6
sacremoses 0.0.43
scikit-learn 0.23.1
scipy 1.4.1
sentencepiece 0.1.91
setuptools 41.2.0
six 1.14.0
soupsieve 2.0.1
soynlp 0.0.493
spacy 2.3.0
srsly 1.0.2
tensorboard 1.15.0
tensorboard-plugin-wit 1.6.0.post3
tensorflow 1.15.0
tensorflow-estimator 1.15.1
termcolor 1.1.0
thinc 7.4.1
threadpoolctl 2.1.0
tokenizers 0.7.0
torch 1.5.1+cu101
torchvision 0.6.1+cu101
tqdm 4.46.1
traitlets 4.3.3
transformers 2.11.0
urllib3 1.24.3
wasabi 0.7.0
wcwidth 0.1.9
Werkzeug 1.0.1
wheel 0.34.2
wrapt 1.12.1
zipp 3.1.0
tokenizers 0.8.1rc1
transformers 3.0.2
Is there anything else I should post?
Bumping @sgugger to take a look at this issue.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
🐛 Bug
Information
Model I am using (Bert, XLNet ...): GPT2-medium & large
Language I am using the model on (English, Chinese ...): Korean (with a custom-trained tokenizer)
The problem arises when using:

```python
tokenizer = GPT2TokenizerFast.from_pretrained("./data/TOKEN")
```
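As a side note, a quick way to sanity-check that the custom tokenizer directory loads and round-trips Korean text before running the full training script is sketched below. This is not part of the original report; it assumes ./data/TOKEN contains the vocab.json and merges.txt files that GPT2TokenizerFast expects, which the report does not show.

```python
# Minimal sketch, not part of the original report: sanity-check the custom tokenizer.
# Assumes ./data/TOKEN holds the vocab.json / merges.txt produced when it was trained.
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("./data/TOKEN", model_max_length=1024)
ids = tokenizer.encode("안녕하세요, 위키백과입니다.")  # any short Korean string
print(ids)
print(tokenizer.decode(ids))
```

The full training script from the report follows.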
```python
config = GPT2Config.from_pretrained('gpt2-medium')
model = GPT2LMHeadModel(config=config)
tokenizer = GPT2TokenizerFast.from_pretrained("./data/TOKEN", model_max_length=1024)

print('loading dataset...')
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="./data/kowiki.txt",
    block_size=512,
)

training_args = TrainingArguments(
    output_dir='./m',                # output directory
    num_train_epochs=1,              # total # of training epochs
    per_device_train_batch_size=1,   # batch size per device during training - the higher the better, but may OOM
    per_device_eval_batch_size=1,    # batch size for evaluation
    logging_dir='./logs',            # directory for storing logs
    save_steps=10000,
    do_train=True
)

trainer = Trainer(
    model=model,                 # the instantiated Transformers model to be trained
    args=training_args,          # training arguments, defined above
    train_dataset=dataset,       # training dataset
)
faulthandler.enable()
trainer.train()
```
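One could also inspect the dataset object before calling trainer.train(). The sketch below is not from the original report and assumes this is the LineByLineTextDataset shipped with transformers 2.11/3.0, whose items are 1-D LongTensors of token ids.

```python
# Sketch, not from the original report: inspect the dataset before training.
print(len(dataset))            # number of non-empty lines that were tokenized
sample = dataset[0]            # a 1-D torch.LongTensor of token ids
print(sample.shape)
print(tokenizer.decode(sample.tolist())[:80])
```

Running the script above produces the following output and crash: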
```
loading dataset...
Epoch:   0%|          | 0/1 [00:00<?, ?it/s]
Fatal Python error: Segmentation fault     | 0/99996 [00:00<?, ?it/s]

Thread 0x00007f872dfff700 (most recent call first):
  File "/opt/conda/lib/python3.6/threading.py", line 299 in wait
  File "/opt/conda/lib/python3.6/threading.py", line 551 in wait
  File "/opt/conda/lib/python3.6/site-packages/tqdm/_monitor.py", line 69 in run
  File "/opt/conda/lib/python3.6/threading.py", line 916 in _bootstrap_inner
  File "/opt/conda/lib/python3.6/threading.py", line 884 in _bootstrap

Thread 0x00007f8736bb5700 (most recent call first):
  File "/opt/conda/lib/python3.6/threading.py", line 299 in wait
  File "/opt/conda/lib/python3.6/queue.py", line 173 in get
  File "/opt/conda/lib/python3.6/site-packages/tensorboard/summary/writer/event_file_writer.py", line 205 in run
  File "/opt/conda/lib/python3.6/threading.py", line 916 in _bootstrap_inner
  File "/opt/conda/lib/python3.6/threading.py", line 884 in _bootstrap

Current thread 0x00007f88273e7740 (most recent call first):
  File "/opt/conda/lib/python3.6/site-packages/torch/cuda/comm.py", line 39 in broadcast_coalesced
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 21 in forward
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/replicate.py", line 71 in _broadcast_coalesced_reshape
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/replicate.py", line 88 in replicate
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 159 in replicate
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 154 in forward
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 577 in __call__
  File "/opt/conda/lib/python3.6/site-packages/transformers/trainer.py", line 622 in _training_step
  File "/opt/conda/lib/python3.6/site-packages/transformers/trainer.py", line 499 in train
  File "trainer.py", line 34 in <module>

Segmentation fault (core dumped)
```
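The crash happens inside torch.nn.parallel (replicate, then broadcast_coalesced), a code path the Trainer only takes when it wraps the model in DataParallel because more than one GPU is visible. One way to check whether the multi-GPU broadcast is the trigger is to rerun on a single device. The sketch below is a hedged suggestion, not something from the thread; the environment-variable approach assumes the variable is set before CUDA is initialized in the process.

```python
# Sketch, not from the original thread: force a single visible GPU so that
# Trainer does not wrap the model in torch.nn.DataParallel, bypassing the
# broadcast_coalesced call where the segmentation fault occurs.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # must be set before CUDA is initialized

import torch
print(torch.cuda.device_count())  # should report 1; DataParallel is then skipped
```

If the segfault disappears on a single GPU, the multi-GPU replication step is the likely culprit; if it persists, the problem is elsewhere in the training loop.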