galsang / BiDAF-pytorch

Re-implementation of BiDAF (Bidirectional Attention Flow for Machine Comprehension, Minjoon Seo et al., ICLR 2017) in PyTorch.

CUDA out of memory is fixed after updating PyTorch to 1.0 #7

Closed BITLsy closed 5 years ago

BITLsy commented 5 years ago

Thanks for your code, very clear! When I first cloned the code and ran it on my GTX 1060 6 GB GPU, it ran out of memory almost immediately. I tried deleting some intermediate variables in the BiDAF model, but that had no effect. Finally, I updated PyTorch to version 1.0 and ran it again: nvidia-smi showed only about 4 GB of GPU memory in use, even after raising train_batch_size to 128. So cool!

galsang commented 5 years ago

That's great to hear! Thanks for testing it.

dominikabasaj commented 5 years ago

Hi, it did not work for me though - did you make any additional changes to the code? I have a GTX 1070 with 8 GB.

BITLsy commented 5 years ago

@dominikabasaj I added this code inside the train and test iteration loops:

import gc  # needed at the top of the file for gc.collect()

# drop the reference to the batch so its tensors can be reclaimed
del batch
# every 100 iterations, force a garbage-collection pass
if i % 100 == 0:
    gc.collect()

but I don't know whether it actually works or not.
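For anyone trying the same workaround, here is a minimal sketch of where the cleanup might sit in a training loop. It is pure Python (no PyTorch dependency) just to illustrate the `del` + periodic `gc.collect()` pattern; the `Batch` class and `train_step` function are hypothetical stand-ins, not part of this repo, and the loop body is simplified.

```python
import gc
import weakref

class Batch:
    """Hypothetical stand-in for a batch object holding large tensors."""
    def __init__(self, data):
        self.data = data

def train_step(batch):
    # placeholder for the real forward/backward pass
    return sum(batch.data)

refs = []  # weak references, only to verify batches really get freed
for i, data in enumerate([[1, 2], [3, 4], [5, 6]]):
    batch = Batch(data)
    refs.append(weakref.ref(batch))
    loss = train_step(batch)

    # drop the only strong reference so the batch can be reclaimed
    del batch
    if i % 100 == 0:
        # periodic collection; CPython frees unreferenced objects
        # immediately, but reference cycles need the collector
        gc.collect()

# every batch from earlier iterations has been freed
assert all(r() is None for r in refs)
```

In an actual PyTorch loop, calling `torch.cuda.empty_cache()` after the `del` can additionally release cached blocks back to the driver so nvidia-smi reflects the drop, although it does not shrink memory held by tensors that are still referenced.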