reppy4620 / Dialog

A PyTorch implementation of a Japanese chatbot using BERT and the Transformer decoder
MIT License

GPU memory estimation issue #12

Closed · kpis-msa closed 4 years ago

kpis-msa commented 4 years ago

I tried to run the evaluation by downloading the pretrained weights, but when I entered the command "python3 run_eval.py", the following error occurred:

Traceback (most recent call last):
  File "run_eval.py", line 12, in <module>
    state_dict = torch.load(f'{Config.data_dir}/{Config.fn}.pth')
  File "/home/m-ishihara/.local/lib/python3.6/site-packages/torch/serialization.py", line 585, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/m-ishihara/.local/lib/python3.6/site-packages/torch/serialization.py", line 765, in _legacy_load
    result = unpickler.load()
  File "/home/m-ishihara/.local/lib/python3.6/site-packages/torch/serialization.py", line 721, in persistent_load
    deserialized_objects[root_key] = restore_location(obj, location)
  File "/home/m-ishihara/.local/lib/python3.6/site-packages/torch/serialization.py", line 174, in default_restore_location
    result = fn(storage, location)
  File "/home/m-ishihara/.local/lib/python3.6/site-packages/torch/serialization.py", line 154, in _cuda_deserialize
    return storage_type(obj.size())
  File "/home/m-ishihara/.local/lib/python3.6/site-packages/torch/cuda/__init__.py", line 480, in _lazy_new
    return super(_CudaBase, cls).__new__(cls, *args, **kwargs)
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 1.96 GiB total capacity; 1.40 GiB already allocated; 11.81 MiB free; 1.48 GiB reserved in total by PyTorch)

After investigating GPU memory usage, I found that all of the GPU memory was free. Please tell me what's wrong.
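For reference, here is a minimal way to check how much memory PyTorch itself has allocated on the GPU (a sketch; nvidia-smi reports the driver's view, which can differ from PyTorch's own counters):

# Sketch: report PyTorch's GPU memory counters for device 0.
# Note: memory_reserved() was named memory_cached() in very old releases.
import torch

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    print(f'allocated: {torch.cuda.memory_allocated(0) / 1024 ** 2:.1f} MiB')
    print(f'reserved:  {torch.cuda.memory_reserved(0) / 1024 ** 2:.1f} MiB')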

reppy4620 commented 4 years ago

Thank you for letting me know.

What kind of GPU did you use when running run_eval.py? The required VRAM is probably larger than what your GPU has, because BERT is quite large. I ran run_eval.py in my environment on a GTX 1080 Ti (which has 11 GB of VRAM), so I didn't pay attention to how much VRAM the script needs.
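If you want a rough lower bound on the VRAM the checkpoint alone needs, you can load it on the CPU and sum the tensor sizes. A sketch, using the same path expression as run_eval.py (run it in that script's context or substitute your own path); it assumes the .pth file stores a flat state_dict of tensors:

# Sketch: estimate the checkpoint's parameter footprint by loading it
# on the CPU and summing tensor byte sizes. Real VRAM usage at run time
# is higher (activations, CUDA context, allocator overhead).
import torch

state_dict = torch.load(f'{Config.data_dir}/{Config.fn}.pth', map_location='cpu')
total_bytes = sum(t.numel() * t.element_size() for t in state_dict.values())
print(f'parameters: {total_bytes / 1024 ** 2:.1f} MiB')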

If you only want to try the evaluation, you can run it on the CPU; replace the two lines indicated below, because the state_dict was saved on CUDA.

################################
# In run_eval.py
################################

# from
device = torch.device(Config.device)

# to
device = torch.device('cpu')

# from
state_dict = torch.load(f'{Config.data_dir}/{Config.fn}.pth')

# to
state_dict = torch.load(f'{Config.data_dir}/{Config.fn}.pth', map_location=device)

Just in case, I have updated the above lines on the master branch, so please check run_eval.py.
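A slightly more general pattern (a sketch, not necessarily what the repo does) picks CUDA when it is available and falls back to the CPU otherwise; map_location then remaps the stored CUDA tensors onto whichever device was chosen:

# Sketch: device-agnostic loading. map_location remaps the CUDA tensors
# stored in the checkpoint onto whichever device was selected here.
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
state_dict = torch.load(f'{Config.data_dir}/{Config.fn}.pth', map_location=device)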

kpis-msa commented 4 years ago

Thank you very much for your answer. I didn't notice those settings were in run_eval.py. I would like to modify the script and rerun it, but unfortunately another process is running on my machine right now, so I will try after that process finishes.