CompVis / adaptive-style-transfer

source code for the ECCV18 paper A Style-Aware Content Loss for Real-time HD Style Transfer
https://compvis.github.io/adaptive-style-transfer/
GNU General Public License v3.0

About GPU memory leak #36

Open xurong1981 opened 4 years ago

xurong1981 commented 4 years ago

Has anyone encountered the following "CUDA_ERROR_ILLEGAL_ADDRESS" error? I switched from multiprocessing to a single process, but the same problem still occurs.

My GPU is a GeForce RTX 2080 8GB (driver: 440.33.01), and my software versions are TensorFlow 1.12.0, CUDA 9.0, and cuDNN 7.5.0.

The training command is:

CUDA_VISIBLE_DEVICES=0 python main.py \
    --model_name=model_roerich \
    --batch_size=1 \
    --phase=train \
    --image_size=768 \
    --lr=0.0002 \
    --dsr=0.8 \
    --ptcd=./data/Places2/data_large \
    --ptad=./data/artist/nicholas-roerich
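For anyone trying to reproduce this, the same run can be launched from a small Python wrapper. This is only a sketch: the helper names `build_training_command`, `build_environment`, and `run_training` are made up for illustration and are not part of the repository. Setting `CUDA_LAUNCH_BLOCKING=1` is my assumption about what might help localize the failure: it makes CUDA kernel launches synchronous, so an illegal memory access tends to be reported closer to the op that caused it. It does not fix anything by itself.

```python
import os
import subprocess


def build_training_command():
    """Return the argv list for the training run quoted in this issue."""
    return [
        "python", "main.py",
        "--model_name=model_roerich",
        "--batch_size=1",
        "--phase=train",
        "--image_size=768",
        "--lr=0.0002",
        "--dsr=0.8",
        "--ptcd=./data/Places2/data_large",
        "--ptad=./data/artist/nicholas-roerich",
    ]


def build_environment(base_env=None, gpu="0", blocking=True):
    """Copy the environment and pin the run to a single GPU.

    With blocking=True, CUDA_LAUNCH_BLOCKING=1 is set as a debugging aid
    (an assumption of this sketch, not something the repository requires).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = gpu
    if blocking:
        env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_training():
    """Launch the training run; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(
        build_training_command(),
        env=build_environment(),
        check=True,
    )
```

Calling `run_training()` from the repository root should start the same run as the shell command, just with synchronous kernel launches, so expect it to be slower.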

Finally, the error messages are:

tensorflow::CurrentStackTrace()
stream_executor::cuda::CUDADriver::SynchronizeContext(stream_executor::cuda::CudaContext)
stream_executor::StreamExecutor::SynchronizeAllActivity()
tensorflow::GPUUtil::SyncAll(tensorflow::Device)
tensorflow::BaseGPUDevice::Sync()

Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int)
std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&)

clone

End stack trace

2020-01-26 00:10:03.045956: E tensorflow/stream_executor/event.cc:34] error destroying CUDA event in context 0x5402c10: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
Traceback (most recent call last):
  File "/home/username/.local/share/virtualenvs/adaptive-style-transfer-PbxNnQ9W/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/username/.local/share/virtualenvs/adaptive-style-transfer-PbxNnQ9W/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/username/.local/share/virtualenvs/adaptive-style-transfer-PbxNnQ9W/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: GPU sync failed