UKPLab / sentence-transformers

State-of-the-Art Text Embeddings
https://www.sbert.net
Apache License 2.0

Is training_nli.py still working correctly? #795

Closed gevezex closed 3 years ago

gevezex commented 3 years ago

I have created a new conda env for sentence-transformers so I have the latest versions of the packages.

When I execute the script without any arguments I get this error:

python training_nli.py
2021-03-05 09:32:29 - Use pytorch device: cuda
2021-03-05 09:32:29 - Read AllNLI train dataset
2021-03-05 09:32:33 - Softmax loss: #Vectors concatenated: 3
2021-03-05 09:32:33 - Read STSbenchmark dev dataset
2021-03-05 09:32:33 - Warmup-steps: 5888
Iteration:   0%|                                                                                                                      | 0/58880 [00:00<?, ?it/s]
Epoch:   0%|                                                                                                                              | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "training_nli.py", line 103, in <module>
    model.fit(train_objectives=[(train_dataloader, train_loss)],
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 561, in fit
    loss_value = loss_model(features, labels)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/sentence_transformers/losses/SoftmaxLoss.py", line 59, in forward
    reps = [self.model(sentence_feature)['sentence_embedding'] for sentence_feature in sentence_features]
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/sentence_transformers/losses/SoftmaxLoss.py", line 59, in <listcomp>
    reps = [self.model(sentence_feature)['sentence_embedding'] for sentence_feature in sentence_features]
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/sentence_transformers/models/Transformer.py", line 40, in forward
    output_states = self.auto_model(**trans_features, return_dict=False)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 966, in forward
    encoder_outputs = self.encoder(
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 567, in forward
    layer_outputs = layer_module(
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 455, in forward
    self_attention_outputs = self.attention(
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 386, in forward
    self_outputs = self.self(
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 252, in forward
    mixed_query_layer = self.query(hidden_states)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

Running the script as CUDA_LAUNCH_BLOCKING=1 python training_nli.py gives the same error:

2021-03-05 09:35:34 - Use pytorch device: cuda
2021-03-05 09:35:34 - Read AllNLI train dataset
2021-03-05 09:35:38 - Softmax loss: #Vectors concatenated: 3
2021-03-05 09:35:38 - Read STSbenchmark dev dataset
2021-03-05 09:35:38 - Warmup-steps: 5888
Iteration:   0%|                                                                                                                      | 0/58880 [00:00<?, ?it/s]
Epoch:   0%|                                                                                                                              | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "training_nli.py", line 103, in <module>
    model.fit(train_objectives=[(train_dataloader, train_loss)],
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 561, in fit
    loss_value = loss_model(features, labels)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/sentence_transformers/losses/SoftmaxLoss.py", line 59, in forward
    reps = [self.model(sentence_feature)['sentence_embedding'] for sentence_feature in sentence_features]
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/sentence_transformers/losses/SoftmaxLoss.py", line 59, in <listcomp>
    reps = [self.model(sentence_feature)['sentence_embedding'] for sentence_feature in sentence_features]
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/sentence_transformers/models/Transformer.py", line 40, in forward
    output_states = self.auto_model(**trans_features, return_dict=False)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 966, in forward
    encoder_outputs = self.encoder(
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 567, in forward
    layer_outputs = layer_module(
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 455, in forward
    self_attention_outputs = self.attention(
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 386, in forward
    self_outputs = self.self(
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 252, in forward
    mixed_query_layer = self.query(hidden_states)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/ayhan/anaconda3/envs/sbert/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

Any clue? I have a working CUDA environment, so that should not be the issue.

nreimers commented 3 years ago

This appears to be more of an issue with CUDA. Try running the example without CUDA.
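
A minimal sketch of one way to run the example without CUDA, assuming training_nli.py is in the current directory. Setting CUDA_VISIBLE_DEVICES to an empty string hides all GPUs from the child process, so PyTorch should report no CUDA device and the script should pick the CPU instead. The helper names cpu_only_env and run_on_cpu are hypothetical and not part of sentence-transformers.

```python
import os
import subprocess
import sys


def cpu_only_env(base_env=None):
    """Return a copy of the environment with every GPU hidden from CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    # An empty value hides all GPUs from processes started with this env.
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def run_on_cpu(script="training_nli.py"):
    """Run the script (path is an assumption) in a process that sees no GPU."""
    completed = subprocess.run([sys.executable, script], env=cpu_only_env())
    return completed.returncode


# Usage (not executed here): exit_code = run_on_cpu()
```

If training then starts normally, although much more slowly, that points at the CUDA setup rather than at the training script.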

I don't know why CUDA is creating this issue.
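
One way to narrow this down is to check whether CUDA fails outside sentence-transformers as well. The traceback ends in a linear layer, where cublasCreate fails, so a single small matrix multiply on the GPU should exercise the same cuBLAS path. This is only a sketch, and cuda_smoke_test is a hypothetical helper name.

```python
def cuda_smoke_test():
    """Try one small matrix multiply on the GPU and report what happened."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "no CUDA device visible to PyTorch"
    try:
        x = torch.randn(8, 8, device="cuda")
        # A GPU matmul goes through cuBLAS, the library named in the error;
        # .item() forces the result back to the host so errors surface here.
        (x @ x).sum().item()
    except RuntimeError as err:
        return f"CUDA failure outside sentence-transformers: {err}"
    return "GPU matmul succeeded"


print(cuda_smoke_test())
```

If this fails in the same way, the problem probably lies in the CUDA or PyTorch installation rather than in training_nli.py.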