abhishekkrthakur / bert-sentiment

MIT License

IndexError: index ? is out of bounds for axis 0 with size ? #8

Closed mbaghou closed 4 years ago

mbaghou commented 4 years ago

Hello,

I tried your implementation of the BERT model to predict the sentiment of a sentence.

When training the model on Google TPUs, I hit an error on this line:

review = str(self.review[item])

I've spent a lot of time on this, but I can't figure out why it throws an index out-of-bounds error.

I started training with a small dataset of 10,000 rows.

Stack trace:

bi = 0, loss = 0.6744452714920044
bi = 10, loss = 0.6480506658554077
bi = 20, loss = 0.6070395708084106
bi = 30, loss = 0.3570273816585541
bi = 40, loss = 0.322771281003952
bi = 50, loss = 0.42349475622177124
bi = 60, loss = 0.2848508358001709
bi = 70, loss = 0.2577969431877136
bi = 80, loss = 0.4233595132827759
bi = 90, loss = 0.600457489490509
bi = 100, loss = 0.22680382430553436
bi = 110, loss = 0.09512724727392197
bi = 120, loss = 0.14158135652542114
bi = 130, loss = 0.653974175453186
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/distributed/parallel_loader.py", line 134, in _loader_worker
    _, data = next(data_iter)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "", line 12, in __getitem__
    review = str(self.review[item])
IndexError: index 4244 is out of bounds for axis 0 with size 979
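
The last line of the trace says that row 4244 was requested while self.review holds only 979 rows, so the indices being drawn do not match the array being indexed. One way to narrow this down is to compare the indices a sampler yields against the dataset's length before training. The following is only a minimal sketch in plain Python: ReviewDataset and find_bad_indices are hypothetical names chosen for illustration, not the repository's code, and the sketch assumes a map-style dataset whose __len__ is derived from the same array that __getitem__ reads.

```python
class ReviewDataset:
    """Hypothetical stand-in for the dataset in this issue (not the repository's class)."""

    def __init__(self, review, target):
        self.review = review
        self.target = target

    def __len__(self):
        # Derive the length from the same sequence that __getitem__ indexes,
        # so the two can never disagree.
        return len(self.review)

    def __getitem__(self, item):
        return {"review": str(self.review[item]), "target": self.target[item]}


def find_bad_indices(dataset, indices):
    """Return every index that falls outside the range 0 .. len(dataset) - 1."""
    n = len(dataset)
    return [i for i in indices if not 0 <= i < n]


if __name__ == "__main__":
    ds = ReviewDataset(["good", "bad", "fine"], [1, 0, 1])
    # Indices as a sampler might yield them; 5 is deliberately out of range.
    sampler_indices = [0, 2, 5]
    print(find_bad_indices(ds, sampler_indices))
```

If a check like this reports any index, one possible cause is that the sampler was built from a different (larger) dataset than the one handed to the DataLoader, or that __len__ reports more rows than the underlying array contains. Neither is confirmed by the trace alone.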