Open r-barnes opened 6 years ago
Hi @r-barnes,
I was able to run phase 1 using
python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir path/to/downloaded/phase1/weights --path_prefix path/to/preprocessed/images --data_info data_info.txt
I had to fix a few small issues, though, and used Python 3 instead of Python 2.
Furthermore, I had to resize the images first using the provided resize.py script.
In data_info.txt you have to list your pre-processed images as described in the recommended repo.
@arashno I assume in the output [0, 1] means empty while [1, 0] means animal, and accordingly label 0 means empty and label 1 means animal?
Thanks @JohannesBrand : I'll give that a try, though I still think the documentation on this project should be expanded.
Despite looking at the recommended repo, it's still a little unclear how to use this.
An example including an appropriately formatted input file and a couple of example images would go a long way towards making this useful to others.
@r-barnes I also have problems running pre-trained models with images I got. I wonder if you have figured out any clear ways to input the images. Thanks!
Hi All, Sorry about my late reply. I was very busy. I would be happy to improve the documentation. Could you please tell me what part of the documentation is unclear to you? Thanks
@JohannesBrand wrote: "@arashno I assume in the output [0, 1] means empty while [1, 0] means animal, and accordingly label 0 means empty and label 1 means animal?"
Yes, 0 means empty and 1 means animal.
Thank you for sharing this repo. We would like to use the phase1 model to make predictions of animal vs. no animal in new images. Initially we intend to make predictions without fine-tuning, so our input is images without labels. Therefore the recommended repo (https://github.com/arashno/tensorflow_multigpu_imagenet) does not seem to fit our application. In the recommended repo, data_info.txt includes labels for each image, whereas in our case we do not have labels but are rather interested in predicting the labels using the phase1 model. We have loaded the phase1 model using the code below, but we are new to tensorflow and do not know how to use the model to make predictions on new images. Any advice (especially additional code to make predictions) would be much appreciated! Thanks!
import os
cur_dir = "C:/etc/phase1/"
rel_path_meta = "snapshot-55.meta"
abs_file_path_meta = os.path.join(cur_dir, rel_path_meta)
print(abs_file_path_meta)
import tensorflow as tf
config = tf.ConfigProto(allow_soft_placement=True)
with tf.Session(config=config) as sess:
    new_saver = tf.train.import_meta_graph(abs_file_path_meta)
    new_saver.restore(sess, tf.train.latest_checkpoint(cur_dir))
There are two solutions:
1- In this repo (Evolving-AI-Lab/deep_learning_for_camera_trap_images), provide fake labels (for example, all empty, all non-empty, or even random labels) and then run the evaluation (i.e. python eval.py ...) of the phase 1 model with those labels. Then, in the output file, disregard the fake labels and keep only the model predictions.
2- The recommended repo (arashno/tensorflow_multigpu_imagenet) now supports "inference" (prediction); you will need to run a command like this:
python run.py inference preds.txt --log_dir path/to/downloaded/phase1/weights --path_prefix path/to/preprocessed/images --data_info data_info.txt ...
Please let me know if any part of the explanation is unclear or you have any trouble.
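The fake-label workaround in solution 1 could be prepared with a small helper like the one below. This is only a sketch: the function names are hypothetical, the label 0 is an arbitrary placeholder, and the comma delimiter is an assumption based on the default that eval.py prints in its argument dump elsewhere in this thread.

```python
import os


def dummy_data_info_lines(filenames, dummy_label=0, delimiter=","):
    """Build one 'filename<delimiter>label' line per JPEG file name."""
    lines = []
    for name in sorted(filenames):
        # Skip anything that is not a JPEG; adjust if your images differ.
        if not name.lower().endswith((".jpg", ".jpeg")):
            continue
        # The label is a placeholder only: ignore it when reading the output.
        lines.append("{}{}{}".format(name, delimiter, dummy_label))
    return lines


def write_dummy_data_info(image_dir, out_path="data_info.txt"):
    """List image_dir and write the placeholder-labelled file."""
    lines = dummy_data_info_lines(os.listdir(image_dir))
    with open(out_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return len(lines)
```

Calling write_dummy_data_info on the folder passed as --path_prefix would produce relative file names, which is what the maintainer asks for later in the thread.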
Thanks for the helpful reply, @arashno! We tried solution 1. We get a syntax error in eval.py. I checked that we are able to import datetime in python in the active environment, so that does not seem to be the problem. I guess I may be missing something that will seem obvious once you've pointed it out! Thanks again for troubleshooting.
In C:/Users/etc/Documents/R/bats/phase1, we are not sure whether we have the weights. We have checkpoint, snapshot-55.data-00000-of-00001, snapshot-55.index, and snapshot-55.meta.
Our data_info.txt reads:
C:/Users/etc/Documents/R/bats/jpg/Bat_licking_DPS - Copy.mov.jpg 1
C:/Users/etc/Documents/R/bats/jpg/Bat_licking_DPS.mov.jpg 1
Here is the output we get:
(r-reticulate) C:\Users\etc\Documents\R\bats>python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir C:/Users/etc/Documents/R/bats/phase1 --path_prefix C:/Users/etc/Documents/R/bats/jpg --data_info data_info.txt
  File "eval.py", line 7
    <!DOCTYPE html>
    ^
SyntaxError: invalid syntax
snapshot-55.data-00000-of-00001 contains the weights.
Your data_info should be like this:
Bat_licking_DPS - Copy.mov.jpg 1
Bat_licking_DPS.mov.jpg 1
The code will add the value of --path_prefix argument to the path of all images.
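Since the prefix is added by the code, entries that already contain the full path need to be shortened. A minimal sketch of that conversion follows; the helper name is hypothetical, and it assumes forward-slash paths and a single space before the label, as in the file quoted above.

```python
def to_relative_entry(line, path_prefix):
    """Turn '<absolute path> <label>' into ('<relative path>', '<label>')."""
    # Split on the last space only, because file names may contain spaces.
    path, label = line.rstrip().rsplit(" ", 1)
    prefix = path_prefix.rstrip("/") + "/"
    # Drop the prefix so that it is not applied twice at load time.
    if path.startswith(prefix):
        path = path[len(prefix):]
    return path, label
```

The returned pair can then be joined with whatever delimiter the loader is configured to use.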
I am confused: you mentioned that you were able to fix the syntax error, so what error are you getting now? Line 7 is where the datetime module is imported.
Hi @arashno -- Thanks for this explanation and guidance.
The invalid syntax error occurred because we had downloaded an HTML page rather than the eval.py file itself. We have solved this issue.
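A quick check for this kind of mix-up (an HTML page saved under the name eval.py) could look like the sketch below. The function name and the 20-line window are arbitrary choices for illustration, not part of the repository.

```python
def looks_like_html(text, max_lines=20):
    """Return True if one of the first lines opens an HTML document."""
    for line in text.splitlines()[:max_lines]:
        stripped = line.strip().lower()
        if stripped.startswith("<!doctype html") or stripped.startswith("<html"):
            return True
    return False
```

Reading the downloaded file with open("eval.py").read() and passing the result to looks_like_html would flag the problem before Python reports a confusing SyntaxError.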
Now we are getting a different error:
(r-reticulate) C:\Users\Documents\R\bats>python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir C:/Users//Documents/R/bats/phase1 --path_prefix C:/Users//Documents/R/bats/jpg --data_info data_info.txt
Namespace(architecture='vgg', batch_size=512, crop_size=[224, 224], data_info='data_info.txt', delimiter=',', depth=50, load_size=[256, 256], log_dir='C:/Users//Documents/R/bats/phase1', num_batches=1, num_channels=3, num_classes=2, num_samples=2, num_threads=4, path_prefix='C:/Users//Documents/R/bats/jpg', save_predictions='preds.txt', top_n=2)
Traceback (most recent call last):
File "eval.py", line 127, in
Having looked at read_label_file in data_loader, it’s not clear what this error is about.
Again, thanks a ton for your help! We appreciate any further advice.
It seems to be a delimiter problem. You set the delimiter to the comma (,), but in your input file, you have used space as the delimiter. Your data_info should look like this:
Bat_licking_DPS - Copy.mov.jpg,1
Bat_licking_DPS.mov.jpg,1
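The delimiter mismatch described above can be caught before running eval.py with a check along these lines. It is a sketch under the assumption that each line should split into exactly two fields (file name and label) on the configured delimiter; how the repository's loader actually parses the file is not shown in this thread.

```python
def find_bad_lines(lines, delimiter=",", num_fields=2):
    """Return (line_number, text) pairs that do not split into num_fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text:
            continue  # ignore blank lines
        if len(text.split(delimiter)) != num_fields:
            bad.append((number, text))
    return bad
```

An empty result means every non-blank line has the expected number of fields for that delimiter.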
Hi @arashno, thank you for pointing this out! We changed data_info.txt as you recommended. We really appreciate your help.
We are now getting a different error that we again can’t figure out:
(r-reticulate) C:\Users\Documents\R\bats>python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir C:/Users//Documents/R/bats/phase1 --path_prefix C:/Users//Documents/R/bats/jpg --data_info data_info.txt
Namespace(architecture='vgg', batch_size=512, crop_size=[224, 224], data_info='data_info.txt', delimiter=',', depth=50, load_size=[256, 256], log_dir='C:/Users//Documents/R/bats/phase1', num_batches=1, num_channels=3, num_classes=2, num_samples=2, num_threads=4, path_prefix='C:/Users//Documents/R/bats/jpg', save_predictions='preds.txt', top_n=2)
WARNING:tensorflow:From C:\Users\Documents\R\bats\data_loader.py:32: slice_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensor_slices(tuple(tensor_list)).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:372: range_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.range(limit).shuffle(limit).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:318: input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensor_slices(input_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:188: limit_epochs (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensors(tensor).repeat(num_epochs).
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:197: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data module.
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:197: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data module.
Filling queue with 2000 images before starting to train. This may take some times.
WARNING:tensorflow:From C:\Users\Documents\R\bats\data_loader.py:65: batch (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.batch(batch_size) (or padded_batch(...) if dynamic_pad=True).
Traceback (most recent call last):
File "eval.py", line 127, in
We found this site (https://blog.csdn.net/Felaim/article/details/84098986) that, according to a collaborator who reads Chinese, suggests the solution involves adding a line to common.py:
is_training = tf.cast(True, tf.bool)
But we don’t know where to try adding this.
Thanks for taking a look at this and any advice on a solution!
It seems that there is a version incompatibility. Which repository are you using? (this one or the recommended repo or a mix of them?)
What is your Tensorflow version?
I was using a mix of the two repos. That makes sense that version incompatibility would result -- my mistake.
Using only this repo, I get this error:
(r-reticulate) C:\Users\Documents\R\bats\deep_learning_for_camera_trap_images-master>python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir C:/Users//Documents/R/bats/deep_learning_for_camera_trap_images-master/phase1 --path_prefix C:/Users//Documents/R/bats/deep_learning_for_camera_trap_images-master/jpg --data_info data_info.txt
Traceback (most recent call last):
File "eval.py", line 15, in
Here are the Tensorflow versions and other packages in the environment:
conda list
# packages in environment at C:\Users\fischhoffi\AppData\Local\conda\conda\envs\r-reticulate:
#
# Name                    Version             Build                 Channel
_tflow_select             2.1.0               gpu                   anaconda
absl-py                   0.6.1               py36_1000             conda-forge
arch                      4.7.0               py36h4a00616_0        bashtage
astor                     0.7.1               py_0                  conda-forge
blas                      1.0                 mkl
ca-certificates           2018.03.07          0                     anaconda
certifi                   2018.10.15          py36_0                anaconda
cudatoolkit               9.0                 1                     anaconda
cudnn                     7.1.4               cuda9.0_0             anaconda
cython                    0.29.2              py36ha925a31_0
gast                      0.2.0               py_0                  conda-forge
grpcio                    1.16.1              py36h351948d_1        anaconda
h5py                      2.8.0               py36hf7173ca_2        anaconda
hdf5                      1.8.20              hac2f561_1            anaconda
icc_rt                    2019.0.0            h0cc432a_1
icu                       58.2                ha66f8fd_1
intel-openmp              2019.1              144
jpeg                      9c                  hfa6e2cd_1001         conda-forge
keras-applications        1.0.6               py36_0                anaconda
keras-preprocessing       1.0.5               py36_0                anaconda
libopencv                 3.4.2               h20b85fd_0            anaconda
libpng                    1.6.36              h7602738_1000         conda-forge
libprotobuf               3.6.1               h1a1b453_1000         conda-forge
libtiff                   4.0.10              h36446d0_1001         conda-forge
libwebp                   1.0.1               hfa6e2cd_1000         conda-forge
m2w64-gcc-libgfortran     5.3.0               6
m2w64-gcc-libs            5.3.0               7
m2w64-gcc-libs-core       5.3.0               7
m2w64-gmp                 6.1.0               2
m2w64-libwinpthread-git   5.0.0.4634.697f757  2
markdown                  2.6.11              py_0                  conda-forge
mkl                       2019.1              144
mkl_fft                   1.0.10              py36_0                conda-forge
mkl_random                1.0.2               py36_0                conda-forge
msgpack-python            0.6.0               py36he980bc4_1000     conda-forge
msys2-conda-epoch         20160418            1
numpy                     1.15.4              py36h19fb1c0_0
numpy-base                1.15.4              py36hc3f5095_0
opencv                    3.4.2               py36h40b0b35_0        anaconda
openssl                   1.1.1               he774522_0            anaconda
pandas                    0.23.4              py36h830ac7b_0
patsy                     0.5.1               py36_0
pip                       18.1                py36_1000             conda-forge
protobuf                  3.6.1               py36he025d50_1001     conda-forge
py-opencv                 3.4.2               py36hc319ecb_0        anaconda
python                    3.6.6               he025d50_0            conda-forge
python-dateutil           2.7.5               py36_0
python-editor             1.0.3               py36_0                anaconda
pytz                      2018.7              py36_0
qt                        5.9.7               vc14h73c81de_0
scipy                     1.1.0               py36h4f6bf74_1        anaconda
setuptools                40.6.3              py36_0                conda-forge
six                       1.12.0              py36_1000             conda-forge
sqlite                    3.26.0              he774522_0
statsmodels               0.9.0               py36h452e1ab_0
tensorboard               1.12.0              py36he025d50_0        anaconda
tensorflow                1.12.0              gpu_py36ha5f9131_0    anaconda
tensorflow-base           1.12.0              gpu_py36h6e53903_0    anaconda
tensorflow-gpu            1.12.0              h0d30ee6_0            anaconda
termcolor                 1.1.0               py_2                  conda-forge
vc                        14.1                h21ff451_3            anaconda
vs2015_runtime            15.5.2              3                     anaconda
werkzeug                  0.14.1              py_0                  conda-forge
wheel                     0.32.3              py36_0                conda-forge
wincertstore              0.2                 py36_1002             conda-forge
zlib                      1.2.11              h2fa13f4_1003         conda-forge
Would you recommend using this repo or the recommended repo? Thanks again!
Although the other repository is compatible with Python 3, this repository only works with Python 2.7. The import error is because you are using Python 3.6.
I just spent a day figuring out how to run the pre-trained models. Here are a few things I learned that might be useful for others:
1) I am working on Windows in Python 3.6 (also tested 3.7). Both versions work, but the xrange() function in eval.py needs to be changed to range().
2) The code works with TensorFlow versions 1.8 and 1.9. It also works with 1.12, but there are a lot of warnings since the data structure has changed. I have not tested 1.10 and 1.11.
3) For Phase 2 and Phase 2 Recognition Only, the common.py file needs to be copied from the architecture folder to the main folder where eval.py is; otherwise you get a "ModuleNotFoundError: No module named 'common'" error.
4) For Phase 2 and Phase 2 Recognition Only, the --depth parameter needs to be set to 152 (for the ResNet-152 model). The default value is 50.
5) For Phase 1, values in the second column of the image file (data_info.txt) need to be either 0 or 1.
6) On my notebook with a Quadro M2000 with 4GB of RAM I ran out of GPU memory. The models worked fine on a GTX 1080 Ti with 11GB of RAM. I tried smaller batch sizes but that did not help.
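For item 1, an alternative to editing every xrange() call is a small compatibility alias placed near the top of the script. This is a common Python 2/3 idiom shown here as a sketch; it is not code from the repository, and it has not been confirmed that xrange is the only Python 3 incompatibility in eval.py.

```python
try:
    xrange  # Python 2: the built-in already exists
except NameError:
    xrange = range  # Python 3: range is already lazy, so alias it

# Existing loops written against xrange keep working unchanged.
total = sum(i for i in xrange(5))
```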
While Phase 1 and Phase 2 Recognition Only work fine, I still have not been able to run Phase 2. I will write another post with the errors I am getting.
It would be nice if the authors could provide a small test dataset with all the input files and commands to run each phase. Would probably save a lot of people a lot of time. That said, thanks for making the code and pre-trained models available!
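Until such examples exist, the commands quoted in this thread can be assembled programmatically, which makes it harder to forget a flag such as --depth. The helper below is a hypothetical sketch that uses only the flags that appear in this thread. The architecture value is left to the caller, because the thread only shows vgg for Phase 1 and never states the value used for the ResNet models.

```python
def build_eval_command(preds_file, architecture, log_dir, path_prefix,
                       data_info, num_threads=4, depth=None):
    """Return the eval.py invocation as an argument list."""
    cmd = [
        "python", "eval.py", preds_file,
        "--num_threads", str(num_threads),
        "--architecture", architecture,
        "--log_dir", log_dir,
        "--path_prefix", path_prefix,
        "--data_info", data_info,
    ]
    # Only pass --depth when asked (e.g. 152, per item 4 above).
    if depth is not None:
        cmd += ["--depth", str(depth)]
    return cmd
```

The resulting list can be handed to subprocess.call, which avoids shell quoting problems with paths that contain spaces.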
For Phase 2 I created a data_info.txt file with the image name plus 9 extra columns, all set to 0, for example: image1.jpg,0,0,0,0,0,0,0,0,0. Without that I would get an error from the data_loader. Now I am getting the error below. Any suggestions are welcome.
Filling queue with 2000 images before starting to train. This may take some times.
Traceback (most recent call last):
File "eval.py", line 204, in <module>
main()
File "eval.py", line 200, in main
evaluate(args)
File "eval.py", line 33, in evaluate
top1acc= [None]*len(logits)
TypeError: object of type 'NoneType' has no len()
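The Phase 2 data_info.txt layout described in this post (image name followed by nine zero columns) can be generated as sketched below. The function name is hypothetical, and because the poster still hit the error above with this layout, it should be read as a reproduction of their input, not as a confirmed working format.

```python
def phase2_placeholder_line(image_name, num_label_columns=9, delimiter=","):
    """Build 'image,0,0,...' with num_label_columns placeholder zeros."""
    return delimiter.join([image_name] + ["0"] * num_label_columns)


# One line per image, ready to be written to data_info.txt.
lines = [phase2_placeholder_line(name) for name in ["image1.jpg", "image2.jpg"]]
```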
@arashno wrote: "Although the other repository is compatible with Python 3, this repository only works with Python 2.7. The import error is because you are using Python 3.6."
Thanks for letting us know @arashno!
Hi @arashno @matobler, I am new to GitHub, so I was wondering whether it is possible to run the pre-trained model of phase 2 on Google Colab. If so, how can I do it? Any help from anyone would be really appreciated.
@Mo-nasr : That might do better as a separate question/issue.
I agree with @matobler :
It would be nice if the authors could provide a small test dataset with all the input files and commands to run each phase. Would probably save a lot of people a lot of time.
I have been running into issues similar to those mentioned by previous posters. I took Mo-nasr's fork and adapted it; thanks for that!
Follow the updated read-me to get it running. New features:
I will try to clean this up and convert more of the code over time. Right now, though, I am not seeing very good classification results: the only things classified correctly are elephants.