Issue summary
Running 'make -j runtest' produces many severe gradient calculation errors.
Every 10 iterations take close to 1 hour, and the loss explodes to nan after 10 iterations.
This happens on a machine with P100 GPUs, but not on a machine with Titan X GPUs.
Steps to reproduce
Follow the tutorial (except use cmake to compile Caffe), and after compilation run 'make test && make -j runtest' from the $CAFFE_ROOT/build directory.
Your system configuration
Operating system: Ubuntu 16.04
Compiler:
CUDA version (if applicable): 8.0
CUDNN version (if applicable): 6.0 (also 5.1)
BLAS: OpenBLAS
Python or MATLAB version (for pycaffe and matcaffe respectively): 2.7
I just noticed that Pascal architectures aren't exactly supported in that Caffe release. I'm trying to merge the ssd branch with master, but the tests are currently failing.
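For anyone checking whether a given build covers their GPU, here is a rough sketch of the compatibility rule as I understand it from the CUDA documentation: compiled device code (sm_XY) runs only on GPUs of the same major version with an equal or higher minor version, while embedded PTX (compute_XY) can be JIT-compiled on any equal or newer GPU. The helper names and the flag lists below are made up for illustration; they are not the actual flags of any Caffe release, and this may or may not be the cause of the failures above.

```python
import re


def parse_gencode(flag):
    """Extract (kind, capability) from the code= part of an nvcc -gencode flag.

    Example: '-gencode arch=compute_50,code=sm_50' -> ('sm', 50)
    """
    match = re.search(r"code=(sm|compute)_(\d+)", flag)
    if match is None:
        raise ValueError("not a recognised -gencode flag: %r" % flag)
    return match.group(1), int(match.group(2))


def support_for(device_cc, gencode_flags):
    """Classify how a build covers a device with compute capability device_cc.

    device_cc is a two-digit integer, e.g. 60 for a P100 (capability 6.0).
    Returns 'native', 'ptx-jit', or 'none'.
    """
    targets = [parse_gencode(flag) for flag in gencode_flags]

    # Compiled device code: same major version, minor version not newer than the device.
    for kind, cc in targets:
        if kind == "sm" and cc // 10 == device_cc // 10 and cc <= device_cc:
            return "native"

    # Embedded PTX: can be JIT-compiled for any device at least as new.
    for kind, cc in targets:
        if kind == "compute" and cc <= device_cc:
            return "ptx-jit"

    return "none"


# Hypothetical flag lists, for illustration only.
pre_pascal_flags = [
    "-gencode arch=compute_50,code=sm_50",
    "-gencode arch=compute_50,code=compute_50",
]
with_pascal_flags = pre_pascal_flags + ["-gencode arch=compute_60,code=sm_60"]

print(support_for(60, pre_pascal_flags))
print(support_for(60, with_pascal_flags))
```

With these made-up lists, a P100 (capability 6.0) would fall back to PTX JIT for the first build and get native code for the second. The real flags for a given build are in the CUDA_ARCH setting of Makefile.config or in the cmake CUDA architecture options.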