Closed petewarden closed 10 years ago
You might look at the changes in dev for grayscale processing to compare against your work.
I'll take a glance at your fork if I can -- keep up your efforts as I know there are groups working on handwriting and scene text recognition and other intensity image data with success.
On Friday, July 18, 2014, Pete Warden notifications@github.com wrote:
I'm working on an O'Reilly webcast focused on getting started with Caffe: http://www.oreilly.com/pub/e/3121
One of my goals is to show folks how to quickly train their own simple network, and I'm able to walk them through training MNIST. The hard part is demonstrating how to use the resulting network on an image.
I modified python/classifier.py to optionally output human-readable scores, which I've been able to use successfully with the prebuilt Imagenet. It's been a struggle to get the same thing working with MNIST though. I've made a bunch of changes so it can handle single-channel grayscale inputs, but I still can't get workable results. The example digit images I try appear to result in random predictions.
Here's my fork, just containing changes in the Python scripts to support MNIST: https://github.com/jetpacapp/caffe
To run the prediction on a digit image, use this command: python python/classify.py --print_results --model_def examples/mnist/lenet.prototxt --pretrained_model examples/mnist/lenet_iter_10000 --force_grayscale --center_only --labels_file data/mnist/mnist_words.txt data/mnist/sample_digit.png foo
I'd hoped to see labels corresponding to the digits in the images I put in, but in fact I seem to see fairly arbitrary results.
Any ideas on how to accomplish my goal of demonstrating MNIST on user-provided images?
— Reply to this email directly or view it on GitHub https://github.com/BVLC/caffe/issues/729.
Evan Shelhamer
After digging deeper, the problem turned out to be the default image dimensions of 256x256: the MNIST network expects 28x28 inputs, so only the very center of each image was being analyzed. Once I figured that out, I was able to get the right results with this modified command line on my fork:
python python/classify.py --print_results --model_def examples/mnist/lenet.prototxt --pretrained_model examples/mnist/lenet_iter_10000 --force_grayscale --center_only --labels_file data/mnist/mnist_words.txt --images_dim 28,28 data/mnist/sample_digit.png foo
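To make the fix above concrete, here is a small sketch (the helper function and numbers are illustrative, not from classify.py itself) of why the default dimensions broke things: the image is first resized to --images_dim, and the 28x28 network input then sees only the center window of that resized image.

```python
def center_crop_box(resized, crop):
    """Return the (left, top, right, bottom) window a center crop keeps."""
    off = (resized - crop) // 2
    return (off, off, off + crop, off + crop)

# With the default --images_dim 256,256, the 28x28 network input covers
# only the middle ~11% of each side of the digit image.
box = center_crop_box(256, 28)
frac = 28 / 256
print(box, round(frac, 3))        # (114, 114, 142, 142) 0.109

# With --images_dim 28,28 the resize already matches the network,
# so the "crop" is the whole image.
print(center_crop_box(28, 28))    # (0, 0, 28, 28)
```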
Since my modifications let python/classify.py work with MNIST as well as Imagenet, I'll put together a pull request from my fork. Having the changes in the main project would be a big help for the tutorial, since I could then direct folks here rather than to my fork.
I've added the pull request as #735.
Thanks for holding the Caffe webinar and for extending classify.py to handle grayscale inputs and human-readable output. Closing, since this is carried out by #816, which takes over for #735.
python python/classify.py --print_results --model_def examples/mnist/lenet.prototxt --pretrained_model examples/mnist/lenet_iter_10000 --force_grayscale --center_only --labels_file data/mnist/mnist_words.txt --images_dim 28,28 data/mnist/sample_digit.png foo
In this command, what is the foo at the end of the path?
I have trained and tested my image datasets using Caffe, and the data in val.txt is the ground truth.
But how can I calculate precision and recall from the ground truth and the predicted values?
Or can I get precision and recall from "python/classify.py"? Could you please give me some suggestions and recommend some Python code?
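classify.py itself only outputs scores, so precision and recall have to be computed from the val.txt labels and the argmax of those scores. scikit-learn's precision_score/recall_score can do this; a minimal dependency-free sketch (the function name and toy labels are illustrative) looks like:

```python
from collections import Counter

def precision_recall(y_true, y_pred):
    """Per-class (precision, recall) from ground-truth and predicted labels."""
    tp = Counter()  # true positives per class
    fp = Counter()  # false positives per class
    fn = Counter()  # false negatives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    classes = set(y_true) | set(y_pred)
    return {
        c: (tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0,
            tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0)
        for c in sorted(classes)
    }

# Toy example: ground-truth labels from val.txt vs. argmax of the scores.
truth = [0, 0, 1, 1, 2]
pred  = [0, 1, 1, 1, 2]
# class 0: precision 1.0, recall 0.5; class 1: precision 2/3, recall 1.0
print(precision_recall(truth, pred))
```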
I tried running the command you suggested, but I am getting an error: Exception: Channel swap needs to have the same number of dimensions as the input channels. I am passing an RGB image as input; since --force_grayscale is there, it should convert the image to a single channel, but it seems it is not doing that. Can you suggest any alternatives?
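One workaround, if the --force_grayscale flag isn't converting the image for you, is to collapse the RGB channels yourself before handing the array to the network. A minimal sketch (the helper and the standard ITU-R 601 luma weights are illustrative, not part of classify.py):

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an HxWx3 RGB array to HxWx1 using ITU-R 601 luma weights."""
    gray = rgb[..., :3] @ np.array([0.299, 0.587, 0.114])
    return gray[..., np.newaxis]  # keep a channel axis, as Caffe expects

img = np.ones((28, 28, 3))  # stand-in for a loaded RGB digit image
g = to_grayscale(img)
print(g.shape)              # (28, 28, 1)
```

With a single-channel input there is no channel swap to apply, which should avoid the dimension-mismatch exception.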
It's hard for new users to get this working.
I did it; check it out: https://github.com/9crk/caffe-mnist-test