Open MatthewInkawhich opened 6 years ago
I failed at "workspace.RunNetOnce(train_model.param_init_net)" with the error "[enforce fail at db.h:206] db_. Cannot open db: /home/henry/caffe2_notebooks/tutorial_data/mnist/mnist-train-nchw-lmdb of type lmdb Error from operator: ..."
@henryguyu Perhaps this occurred when you were pasting the error into your comment, but check your path. The MNIST db should be located at:
/home/henry/caffe2_notebooks/tutorial_data/mnist/mnist-train-nchw-lmdb
Notice "mnist", not "minist", and "mnist-train-nchw-lmdb", not "minist-train-nchw-lmdb".
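For anyone hitting the same "Cannot open db" error, a quick path check before running the init net can rule out a typo. This is only a minimal sketch: check_lmdb_dir is a hypothetical helper, not part of Caffe2, and it assumes the database uses the default LMDB layout, a directory containing a data.mdb file.

```python
import os


def check_lmdb_dir(db_path):
    """Return a list of problems found with an LMDB directory (empty if none).

    Assumption: the database uses the default LMDB layout, i.e. a directory
    that contains a file named data.mdb.
    """
    problems = []
    if not os.path.isdir(db_path):
        problems.append("not a directory: %s" % db_path)
        return problems
    data_file = os.path.join(db_path, "data.mdb")
    if not os.path.isfile(data_file):
        problems.append("missing data.mdb in: %s" % db_path)
    return problems
```

Calling check_lmdb_dir with the path above should return an empty list if the directory exists and contains data.mdb; anything else points at the path rather than at Caffe2.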
@MatthewInkawhich That was just my typo; thanks anyway. I tried it in a Jupyter notebook and will try it from the terminal.
Machine: MacBook Pro 2015
OS: MacOS High Sierra
Jupyter: Version 4.4.0
Caffe2: Conda CPU-only install
Python: Version 2.7.14 Anaconda
When running the MNIST.ipynb tutorial out of a Jupyter Notebook, my model does not train. My training accuracy either starts at 1.0, or it starts as expected (~0.10) and immediately goes to 1.0. Meanwhile, when I print the loss I see "nan".
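To pin down the iteration at which the loss first becomes "nan", a small check over the recorded loss values can help. This is a rough sketch, not code from the tutorial: first_bad_loss is a hypothetical helper, and it assumes the per-iteration losses have already been collected into a Python list of numbers.

```python
import math


def first_bad_loss(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for i, loss in enumerate(losses):
        value = float(loss)
        if math.isnan(value) or math.isinf(value):
            return i
    return None
```

If the returned index is 0, the loss was already non-finite on the first iteration, which would suggest the problem lies in initialization or input data rather than in the updates.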
When I copied the code over to a Python script and ran it via the terminal (Mac), it trained as intended.
I also tried copying the code from a training script that I wrote (and know works) into a single block in a Jupyter Notebook as a check. Again, when run out of the Notebook, training accuracy goes to 1.0, while loss is "nan". I am not encountering this issue on my other Mac laptop, on which Caffe2 was last compiled in January.
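Since the same code behaves differently in the Notebook and in the terminal, one thing worth comparing is which Python interpreter and which Caffe2 install each one actually uses. The following is a sketch under that assumption; describe_environment is a hypothetical helper, and a mismatch here is only one possible explanation, not a confirmed cause.

```python
import sys


def describe_environment(module_name="caffe2"):
    """Collect the interpreter path, Python version, and a module's file location."""
    info = {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "module_file": None,
    }
    try:
        module = __import__(module_name)
        info["module_file"] = getattr(module, "__file__", None)
    except ImportError:
        # The module is not installed for this interpreter.
        info["module_file"] = "not importable"
    return info
```

Running describe_environment() once in a Notebook cell and once in the terminal script, then comparing the two dictionaries, would show whether both are picking up the same interpreter and the same Caffe2 build.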
Here is the training loop in my script:
Snippet of output from my script when run from the terminal:
Snippet of output from my script when run out of Jupyter Notebook:
Here are the graphs from the MNIST tutorial Notebook:
Has anyone else encountered this and/or have a solution?
Thanks, Matt