Closed · FuHongy closed this issue 6 years ago.
FuHongy: Thanks for your excellent work! @ramanishka I only have a GTX 1050 Ti card. When I was training on MSR-VTT, I always got an out-of-memory error even after reducing batch_size to 1. I'd like to know how much memory it needs at a minimum. Thanks.
It shows like this:

```
(base) E:\caption-guided-saliency>python run_s2vt.py --train
D:\Program Files (x86)\Anaconda3\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
.\experiments\msr-vtt\
run_s2vt.py:61: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  train_vids['video_path'] = train_vids['video_id'].map(lambda x: os.path.join(cfg.path_to_trainval_descriptors, x + "_incp_v3.npy"))
run_s2vt.py:62: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  val_vids['video_path'] = val_vids['video_id'].map(lambda x: os.path.join(cfg.path_to_trainval_descriptors, x + "_incp_v3.npy"))
preprocessing word counts and creating vocab based on word count threshold 1
filtered words from 23667 to 23667
2018-10-09 18:25:30.891834: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-10-09 18:25:31.069295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.43
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.30GiB
2018-10-09 18:25:31.073234: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-10-09 18:25:31.456236: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-10-09 18:25:31.459356: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2018-10-09 18:25:31.461481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2018-10-09 18:25:31.463833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3686 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-10-09 18:25:31.470165: E tensorflow/stream_executor/cuda/cuda_driver.cc:903] failed to allocate 3.60G (3865470464 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-10-09 18:25:31.473418: E tensorflow/stream_executor/cuda/cuda_driver.cc:903] failed to allocate 3.24G (3478923264 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
[0, 13020, 26040, 39060, 52080, 65100, 78120, 91140, 104160, 117180]
2018-10-09 18:25:48.659384: E tensorflow/stream_executor/cuda/cuda_driver.cc:903] failed to allocate 700.42M (734439680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-10-09 18:25:48.663051: E tensorflow/stream_executor/cuda/cuda_driver.cc:903] failed to allocate 700.42M (734439680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
forrtl: error (200): program aborting due to control-C event
Image              PC                Routine            Line        Source
libifcoremd.dll    00007FFE094694C4  Unknown            Unknown     Unknown
KERNELBASE.dll     00007FFEA36A56FD  Unknown            Unknown     Unknown
KERNEL32.DLL       00007FFEA47F3034  Unknown            Unknown     Unknown
ntdll.dll          00007FFEA6501461  Unknown            Unknown     Unknown
```

ramanishka: For batch_size=1: ~4.3GB. Without diving too deep into the details, I would recommend keeping batch_size > 1 but reducing dim_hidden in cfg.py (L77, the LSTM hidden-state size) to around 500; also consider reducing n_lstm_step there. With that, you can easily squeeze it under 3GB (see the sketch of the change below).

FuHongy: Thank you for your reply! It helps me a lot.
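A minimal sketch of the suggested cfg.py edit, assuming the settings are plain module-level constants. Only the names dim_hidden, n_lstm_step, and batch_size appear in the thread; the values below are illustrative placeholders, not the repo's actual defaults:

```python
# cfg.py -- memory-reduction tweaks along the lines suggested above.
# Values are illustrative placeholders; only the names come from the thread.
dim_hidden = 500    # LSTM hidden-state size (cfg.py L77), reduced to ~500 as recommended
n_lstm_step = 40    # fewer unrolled LSTM steps; pick something below the repo default
batch_size = 8      # keep batch_size > 1, per the recommendation
```

Both dim_hidden and n_lstm_step scale the size of the unrolled LSTM graph, so reducing either lowers peak GPU memory.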
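Not from the thread, but relevant to the log: the failed-to-allocate errors right at startup appear because TensorFlow 1.x pre-allocates almost all free GPU memory by default. A sketch of the standard workaround, assuming it is applied where run_s2vt.py creates its session; it only changes the allocation strategy, so a model that genuinely needs ~4.3GB will still OOM on a 4GB card:

```python
import tensorflow as tf  # TF 1.x, matching the log above

# Ask TensorFlow to allocate GPU memory incrementally instead of
# grabbing nearly all free memory up front; this removes the
# startup failed-to-allocate noise but does not lower peak usage.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
```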
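The SettingWithCopyWarning messages in the log are harmless noise, unrelated to the OOM. A sketch of the usual fix for run_s2vt.py lines 61-62, assuming train_vids and val_vids are slices of a larger DataFrame as the warning implies; train_vids, val_vids, and cfg are the script's own names quoted in the log:

```python
import os

# Take explicit copies so the column assignments below act on
# independent DataFrames rather than views of the original frame,
# which is what triggers SettingWithCopyWarning.
train_vids = train_vids.copy()
val_vids = val_vids.copy()
train_vids['video_path'] = train_vids['video_id'].map(
    lambda x: os.path.join(cfg.path_to_trainval_descriptors, x + "_incp_v3.npy"))
val_vids['video_path'] = val_vids['video_id'].map(
    lambda x: os.path.join(cfg.path_to_trainval_descriptors, x + "_incp_v3.npy"))
```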