Hi, I'm trying to use Longformer in contextualized topic models (GitHub page). I replaced "paraphrase-distilroberta-base-v2" with "allenai/longformer-base-4096" (a rough sketch of my preparation script is included after the log below) and I'm getting the following errors:
WARNING:root:No sentence-transformers model found with name /home/extern/user/.cache/torch/sentence_transformers/allenai_longformer-base-4096. Creating a new one with MEAN pooling.
Some weights of the model checkpoint at /home/extern/user/.cache/torch/sentence_transformers/allenai_longformer-base-4096 were not used when initializing LongformerModel: ['lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.bias']
- This IS expected if you are initializing LongformerModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LongformerModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Batches: 0%| | 0/1194 [00:00<?, ?it/s]
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [64,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [65,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [66,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [67,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [68,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [69,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [70,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [71,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [72,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [73,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [74,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [259,0,0], thread: [75,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
[. . .]
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [389,0,0], thread: [94,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:703: indexSelectLargeIndex: block: [389,0,0], thread: [95,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Batches: 0%| | 0/1194 [00:03<?, ?it/s]
Traceback (most recent call last):
File "ctm/tm_preparation_4096.py", line 52, in <module>
training_dataset = tp.fit(text_for_contextual=unpreprocessed_corpus, text_for_bow=preprocessed_documents)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/contextualized_topic_models/utils/data_preparation.py", line 69, in fit
train_contextualized_embeddings = bert_embeddings_from_list(text_for_contextual, self.contextualized_model)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/contextualized_topic_models/utils/data_preparation.py", line 36, in bert_embeddings_from_list
return np.array(model.encode(texts, show_progress_bar=True, batch_size=batch_size))
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/sentence_transformers/SentenceTransformer.py", line 164, in encode
out_features = self.forward(features)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/sentence_transformers/models/Transformer.py", line 66, in forward
output_states = self.auto_model(**trans_features, return_dict=False)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/transformers/models/longformer/modeling_longformer.py", line 1703, in forward
embedding_output = self.embeddings(
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/transformers/models/longformer/modeling_longformer.py", line 485, in forward
position_embeddings = self.position_embeddings(position_ids)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 158, in forward
return F.embedding(
File "/home/extern/user/.conda/envs/ctm_env/lib/python3.8/site-packages/torch/nn/functional.py", line 2183, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
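For reference, the relevant part of tm_preparation_4096.py looks roughly like the sketch below. The corpus variables are placeholders here (in the real script they are built from my dataset), and the only change with respect to the usual CTM example is the contextualized model name.

```python
from contextualized_topic_models.utils.data_preparation import TopicModelDataPreparation

# Placeholder corpora; in the real script these come from my dataset.
unpreprocessed_corpus = ["first raw document ...", "second raw document ..."]
preprocessed_documents = ["first document", "second document"]

# Only change vs. the usual CTM setup: "allenai/longformer-base-4096"
# instead of "paraphrase-distilroberta-base-v2".
tp = TopicModelDataPreparation("allenai/longformer-base-4096")

training_dataset = tp.fit(
    text_for_contextual=unpreprocessed_corpus,
    text_for_bow=preprocessed_documents,
)
```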
I am in a multi-GPU environment. Am I doing something wrong?
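If it helps narrow things down: from the traceback, the failure seems to come from the model.encode call inside bert_embeddings_from_list. A minimal sketch of just that step (assuming the MEAN-pooling wrapper that sentence-transformers reports building for "allenai/longformer-base-4096"; batch_size left at the default here) would be:

```python
from sentence_transformers import SentenceTransformer

# No native sentence-transformers model exists under this name, so a new
# wrapper with MEAN pooling is created (as the warning above says).
model = SentenceTransformer("allenai/longformer-base-4096")

# Placeholder for the raw documents; the same list is passed as
# text_for_contextual in the preparation script above.
unpreprocessed_corpus = ["first raw document ...", "second raw document ..."]

embeddings = model.encode(unpreprocessed_corpus, show_progress_bar=True)
```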