ivanmontero opened this issue · closed 3 years ago
Running the command under the training section of the README, the program fails in the first optimization step with the following message:
```
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [535,0,0], thread: [96,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
```
Which is thrown from the following:
```
...
  File "/task_runtime/src/transformers-4.2.1/src/transformers/models/bert/modeling_bert.py", line 956, in forward
    past_key_values_length=past_key_values_length,
...
  File "/miniconda/lib/python3.7/site-packages/torch/nn/functional.py", line 2043, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
```
In other words, the base model (bert-base-cased) encounters an input with a larger sequence length than what it can handle (535 > 512).
Given the above, how do you get around it, and how do you apply your method to entire documents (i.e., the MS MARCO Document Ranking table)?
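For documents longer than BERT's 512-token limit, a common workaround is a sliding window: split the token sequence into overlapping chunks that each fit the model, score each chunk, and aggregate (e.g., take the max score, as in MaxP-style document ranking). A minimal, dependency-free sketch of the chunking step, with illustrative names and parameters (not the repo's actual code):

```python
def chunk_tokens(token_ids, max_len=512, stride=256):
    """Split a long token-id sequence into overlapping windows that each
    fit within the model's max_len. Hypothetical helper; the window size
    and stride are illustrative defaults, not values from this repo."""
    chunks = []
    start = 0
    while start < len(token_ids):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # last window already covers the tail of the document
        start += stride
    return chunks

# A 1000-token "document" yields three windows, each within the limit.
chunks = chunk_tokens(list(range(1000)))
assert all(len(c) <= 512 for c in chunks)
assert chunks[-1][-1] == 999  # the final token is covered
```

In practice you would score each chunk with the ranking model and combine the per-chunk scores; how the aggregation is done depends on the method in the README.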
Found my issue: I was using bert-base-cased instead of bert-base-uncased.
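This explains the assert: the failure is an out-of-range index into an embedding table, not an over-long sequence. bert-base-uncased has a 30,522-entry vocabulary while bert-base-cased has only 28,996, so token ids produced by the uncased tokenizer can index past the cased model's word-embedding table, tripping `srcIndex < srcSelectDimSize`. A tiny sketch of the arithmetic behind the mismatch:

```python
# Vocab sizes of the two stock checkpoints (from their configs).
UNCASED_VOCAB = 30522  # bert-base-uncased
CASED_VOCAB = 28996    # bert-base-cased

# An illustrative token id valid for the uncased tokenizer...
example_id = 30000
# ...indexes past the end of the cased model's embedding table,
# which is exactly what the CUDA indexSelect assertion catches.
out_of_range = example_id >= CASED_VOCAB
```

Matching the tokenizer to the checkpoint (both uncased, or both cased) removes the mismatch.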