Evolving-AI-Lab / deep_learning_for_camera_trap_images

This repo contains code+pre-trained models for extracting information from camera-trap images. The pre-trained models have been trained on the Snapshot Serengeti dataset.
MIT License

Unclear how to use the code #5

Open r-barnes opened 6 years ago

r-barnes commented 6 years ago

Despite looking at the recommended repo, it's still a little unclear how to use this.

An example including an appropriately formatted input file and a couple of example images would go a long way towards making this useful to others.

JohannesBrand commented 6 years ago

Hi @r-barnes ,

I was able to run phase 1 using

python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir path/to/downloaded/phase1/weights --path_prefix path/to/preprocessed/images --data_info data_info.txt

I had to fix a few small issues, though, and used Python 3 instead of Python 2.

Furthermore, I had to resize the images first using the provided resize.py script.

In data_info.txt you have to list your pre-processed images as described in the recommended repo.

@arashno I assume that in the output [0, 1] means empty while [1, 0] means animal, and accordingly label 0 means empty and label 1 means animal?

r-barnes commented 6 years ago

Thanks @JohannesBrand : I'll give that a try, though I still think the documentation on this project should be expanded.

jianqiu-xu commented 6 years ago

@r-barnes I also have problems running pre-trained models with images I got. I wonder if you have figured out any clear ways to input the images. Thanks!

arashno commented 6 years ago

Hi All, Sorry about my late reply. I was very busy. I would be happy to improve the documentation. Could you please tell me what part of the documentation is unclear to you? Thanks

arashno commented 6 years ago

> @arashno I assume in the output [0, 1] means empty while [1, 0] means animal and accordingly label 0 means empty and label 1 means animal?

Yes, 0 means empty and 1 means animal.
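As a small illustration of the label convention confirmed above, here is a minimal sketch that maps a phase 1 class index to a readable name. The `LABEL_NAMES` table and the `label_name` helper are hypothetical and not part of the repo; only the 0 = empty / 1 = animal convention comes from this thread.

```python
# Hypothetical helper; only the 0/1 convention comes from the thread.
LABEL_NAMES = {0: "empty", 1: "animal"}


def label_name(label):
    """Return the human-readable name for a phase 1 class index."""
    try:
        return LABEL_NAMES[int(label)]
    except KeyError:
        raise ValueError("phase 1 labels must be 0 or 1, got %r" % (label,))
```

Calling `label_name(0)` gives "empty" and `label_name(1)` gives "animal"; any other index raises a `ValueError`.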

fischhoff commented 5 years ago

Thank you for sharing this repo. We would like to use the phase1 model to make predictions of animal vs. no animal in new images. Initially we intend to make predictions without fine-tuning, so our input is images without labels. Therefore the recommended repo (https://github.com/arashno/tensorflow_multigpu_imagenet) does not seem to fit our application. In the recommended repo, data_info.txt includes labels for each image, whereas in our case we do not have labels but are rather interested in predicting the labels using the phase1 model. We have loaded the phase1 model using the code below, but we are new to tensorflow and do not know how to use the model to make predictions on new images. Any advice (especially additional code to make predictions) would be much appreciated! Thanks!

import os
import tensorflow as tf

cur_dir = "C:/etc/phase1/"
# script_dir = os.path.dirname(__file__)  # <-- absolute dir the script is in
rel_path_meta = "snapshot-55.meta"
abs_file_path_meta = os.path.join(cur_dir, rel_path_meta)
# abs_file_path_meta = os.path.join(script_dir, rel_path_meta)
print(abs_file_path_meta)

config = tf.ConfigProto(allow_soft_placement=True)
with tf.Session(config=config) as sess:
# with tf.Session() as sess:
    new_saver = tf.train.import_meta_graph(abs_file_path_meta)
    new_saver.restore(sess, tf.train.latest_checkpoint(cur_dir))

arashno commented 5 years ago

There are two solutions:

1- in this repo (Evolving-AI-Lab/deep_learning_for_camera_trap_images), provide fake labels (for example, all empty or all full or even random labels) and then run the evaluation (i.e. python eval.py ...) of the phase 1 model over the provided labels. Then, in the output file, disregard the fake labels and take out the model predictions only.

2- The recommended repo (arashno/tensorflow_multigpu_imagenet) now supports "inference" (prediction). You will need to run a command like this:

python run.py inference preds.txt --log_dir path/to/downloaded/phase1/weights --path_prefix path/to/preprocessed/images --data_info data_info.txt ...

Please let me know if any part of the explanation is unclear or you have any trouble.
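Solution 1 above needs a data_info.txt with placeholder labels. A minimal sketch of how such a file could be generated for a folder of unlabeled images follows. The helper name `write_dummy_data_info`, the list of image extensions, and the comma delimiter are assumptions made for illustration (the comma matches the `delimiter=','` default printed later in this thread); none of this is code from the repo.

```python
import os


def write_dummy_data_info(image_dir, out_path, dummy_label=0, delimiter=","):
    """Write one 'filename<delimiter>label' line per image found in image_dir."""
    extensions = (".jpg", ".jpeg", ".png")  # assumed set of image types
    names = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(extensions)
    )
    with open(out_path, "w") as handle:
        for name in names:
            # The label is only a placeholder; disregard it when reading predictions.
            handle.write("%s%s%d\n" % (name, delimiter, dummy_label))
    return names
```

The file names are written relative to `image_dir`, so the same directory would be passed to eval.py as `--path_prefix`.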

fischhoff commented 5 years ago

Thanks for the helpful reply, @arashno! We tried solution 1. We get a syntax error in eval.py. I checked that we are able to import datetime in python in the active environment, so that does not seem to be the problem. I guess I may be missing something that will seem obvious once you've pointed it out! Thanks again for troubleshooting.

In C:/Users/etc/Documents/R/bats/phase1, we are not sure whether we have the weights. We have checkpoint, snapshot-55.data-00000-of-00001, snapshot-55.index, and snapshot-55.meta.

Our data_info.txt reads:

C:/Users/etc/Documents/R/bats/jpg/Bat_licking_DPS - Copy.mov.jpg 1
C:/Users/etc/Documents/R/bats/jpg/Bat_licking_DPS.mov.jpg 1

Here is the output we get:

(r-reticulate) C:\Users\etc\Documents\R\bats>python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir C:/Users/etc/Documents/R/bats/phase1 --path_prefix C:/Users/etc/Documents/R/bats/jpg --data_info data_info.txt
  File "eval.py", line 7
    <!DOCTYPE html>
    ^
SyntaxError: invalid syntax

arashno commented 5 years ago

snapshot-55.data-00000-of-00001 contains the weights.
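To double-check that a downloaded model folder is complete, a small sketch like the one below can list which checkpoint pieces are missing. The helper name `missing_checkpoint_files` is hypothetical, and the default prefix `snapshot-55` is simply the one mentioned in this thread; other phases may use a different prefix.

```python
import glob
import os


def missing_checkpoint_files(log_dir, prefix="snapshot-55"):
    """Return the checkpoint pieces that are absent from log_dir."""
    missing = []
    for name in ("checkpoint", prefix + ".index", prefix + ".meta"):
        if not os.path.isfile(os.path.join(log_dir, name)):
            missing.append(name)
    # The weights live in shard files named <prefix>.data-XXXXX-of-YYYYY.
    pattern = os.path.join(glob.escape(log_dir), prefix + ".data-*")
    if not glob.glob(pattern):
        missing.append(prefix + ".data-*")
    return missing
```

An empty list means all four kinds of files listed above were found in the directory passed as `--log_dir`.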

Your data_info should be like this:

Bat_licking_DPS - Copy.mov.jpg 1
Bat_licking_DPS.mov.jpg 1

The code will add the value of --path_prefix argument to the path of all images.
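Since the prefix is added by the code, absolute paths in data_info.txt need to be shortened to paths relative to `--path_prefix`. The following is a minimal sketch of that conversion; `strip_prefix` is a hypothetical helper, and exactly how the repo joins the prefix and the file name is not shown in this thread.

```python
def strip_prefix(image_path, path_prefix):
    """Return image_path relative to path_prefix, using forward slashes."""
    path = image_path.replace("\\", "/")
    prefix = path_prefix.replace("\\", "/").rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError("%s is not under %s" % (image_path, path_prefix))
    return path[len(prefix):]
```

For the paths in this thread, stripping `C:/Users/etc/Documents/R/bats/jpg` from the first entry leaves `Bat_licking_DPS - Copy.mov.jpg`.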

I am confused: you mentioned that you were able to fix the syntax error, so what error are you getting now? Line 7 of eval.py is the import of the datetime module.

fischhoff commented 5 years ago

Hi @arashno -- Thanks for this explanation and guidance.

The invalid syntax error occurred because we had downloaded an HTML file rather than the eval.py file itself. We have solved this issue.
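This mistake (saving the web page instead of the raw script) can be caught before running anything. Below is a rough heuristic sketch, not part of the repo: the hypothetical `looks_like_html` helper inspects the first bytes of a file for HTML markers. It could misfire on a real script that mentions these markers near the top.

```python
def looks_like_html(path, probe_bytes=2048):
    """Return True if the start of the file looks like an HTML page."""
    with open(path, "rb") as handle:
        head = handle.read(probe_bytes).lower()
    return b"<!doctype html" in head or b"<html" in head
```

If `looks_like_html("eval.py")` returns True, the file should be downloaded again using the raw view or by cloning the repository.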

Now we are getting a different error:

(r-reticulate) C:\Users\Documents\R\bats>python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir C:/Users//Documents/R/bats/phase1 --path_prefix C:/Users//Documents/R/bats/jpg --data_info data_info.txt
Namespace(architecture='vgg', batch_size=512, crop_size=[224, 224], data_info='data_info.txt', delimiter=',', depth=50, load_size=[256, 256], log_dir='C:/Users//Documents/R/bats/phase1', num_batches=1, num_channels=3, num_classes=2, num_samples=2, num_threads=4, path_prefix='C:/Users//Documents/R/bats/jpg', save_predictions='preds.txt', top_n=2)
Traceback (most recent call last):
  File "eval.py", line 127, in <module>
    main()
  File "eval.py", line 123, in main
    evaluate(args)
  File "eval.py", line 25, in evaluate
    images, labels, urls = data_loader.read_inputs(False, args)
  File "C:\Users\Documents\R\bats\data_loader.py", line 24, in read_inputs
    filepaths, labels = _read_label_file(args.data_info, args.delimiter)
  File "C:\Users\Documents\R\bats\data_loader.py", line 19, in _read_label_file
    labels.append(int(tokens[1]))
IndexError: list index out of range

Having looked at _read_label_file in data_loader.py, it's not clear to us what this error is about.

Again, thanks a ton for your help! We appreciate any further advice.

arashno commented 5 years ago

It seems to be a delimiter problem. You set the delimiter to the comma (,), but in your input file, you have used space as the delimiter. Your data_info should look like this:

Bat_licking_DPS - Copy.mov.jpg,1
Bat_licking_DPS.mov.jpg,1
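A delimiter mismatch like this can be detected before running eval.py. The sketch below is a hypothetical checker (not code from the repo) that mirrors the failing `int(tokens[1])` step from the traceback and reports every line that has no integer in the second column.

```python
def find_bad_lines(path, delimiter=","):
    """Return (line_number, text) pairs that lack an integer label in column 2."""
    bad = []
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\n")
            if not text.strip():
                continue  # skip blank lines
            tokens = text.split(delimiter)
            try:
                int(tokens[1])
            except (IndexError, ValueError):
                # IndexError: delimiter not found; ValueError: label is not an integer.
                bad.append((number, text))
    return bad
```

An empty result means every non-blank line splits into at least two fields with an integer label; a space-delimited file checked with the comma delimiter would be reported line by line.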

fischhoff commented 5 years ago

Hi @arashno, thank you for pointing this out! We changed data_info.txt as you recommended. We really appreciate your help.

We are now getting a different error that we again can’t figure out:

(r-reticulate) C:\Users\Documents\R\bats>python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir C:/Users//Documents/R/bats/phase1 --path_prefix C:/Users//Documents/R/bats/jpg --data_info data_info.txt
Namespace(architecture='vgg', batch_size=512, crop_size=[224, 224], data_info='data_info.txt', delimiter=',', depth=50, load_size=[256, 256], log_dir='C:/Users//Documents/R/bats/phase1', num_batches=1, num_channels=3, num_classes=2, num_samples=2, num_threads=4, path_prefix='C:/Users//Documents/R/bats/jpg', save_predictions='preds.txt', top_n=2)
WARNING:tensorflow:From C:\Users\Documents\R\bats\data_loader.py:32: slice_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensor_slices(tuple(tensor_list)).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:372: range_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.range(limit).shuffle(limit).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:318: input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensor_slices(input_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs). If shuffle=False, omit the .shuffle(...).
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:188: limit_epochs (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.from_tensors(tensor).repeat(num_epochs).
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:197: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data module.
WARNING:tensorflow:From C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\training\input.py:197: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data module.
Filling queue with 2000 images before starting to train. This may take some times.
WARNING:tensorflow:From C:\Users\Documents\R\bats\data_loader.py:65: batch (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by tf.data. Use tf.data.Dataset.batch(batch_size) (or padded_batch(...) if dynamic_pad=True).
Traceback (most recent call last):
  File "eval.py", line 127, in <module>
    main()
  File "eval.py", line 123, in main
    evaluate(args)
  File "eval.py", line 30, in evaluate
    logits = arch.get_model(images, 0.0, False, args)
  File "C:\Users\Documents\R\bats\arch.py", line 16, in get_model
    return architectures.vgg.inference(inputs, args.num_classes, wd, 0.5 if is_training else 1.0, is_training)
  File "C:\Users\Documents\R\bats\architectures\vgg.py", line 32, in inference
    network = common.batchNormalization(network, is_training= is_training)
  File "C:\Users\Documents\R\bats\common.py", line 63, in batchNormalization
    return tf.cond(is_training, lambda: tf.nn.batch_normalization(x, mean, variance, beta, gamma, epsilon), lambda: tf.nn.batch_normalization(x, moving_mean, moving_variance, beta, gamma, epsilon))
  File "C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\util\deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "C:\Users\AppData\Local\conda\conda\envs\r-reticulate\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2073, in cond
    raise TypeError("pred must not be a Python bool")
TypeError: pred must not be a Python bool

We found this site (https://blog.csdn.net/Felaim/article/details/84098986) that (according to a collaborator who reads Chinese) suggests a solution would involve adding a line to common.py:

is_training = tf.cast(True, tf.bool)

But we don’t know where to try adding this.

Thanks for taking a look at this and any advice on a solution!

arashno commented 5 years ago

It seems that there is a version incompatibility. Which repository are you using? (this one or the recommended repo or a mix of them?)

What is your Tensorflow version?

fischhoff commented 5 years ago

I was using a mix of the two repos. It makes sense that a version incompatibility would result -- my mistake.

Using only this repo, I get this error:

(r-reticulate) C:\Users\Documents\R\bats\deep_learning_for_camera_trap_images-master>python eval.py preds.txt --num_threads 4 --architecture vgg --log_dir C:/Users//Documents/R/bats/deep_learning_for_camera_trap_images-master/phase1 --path_prefix C:/Users//Documents/R/bats/deep_learning_for_camera_trap_images-master/jpg --data_info data_info.txt
Traceback (most recent call last):
  File "eval.py", line 15, in <module>
    import arch
  File "C:\Users\fischhoffi\Documents\R\bats\deep_learning_for_camera_trap_images-master\arch.py", line 1, in <module>
    import architectures.alexnet
  File "C:\Users\fischhoffi\Documents\R\bats\deep_learning_for_camera_trap_images-master\architectures\alexnet.py", line 2, in <module>
    import common
ModuleNotFoundError: No module named 'common'

Here are the Tensorflow versions and other packages in the environment:

conda list

# packages in environment at C:\Users\fischhoffi\AppData\Local\conda\conda\envs\r-reticulate:
#
# Name                    Version               Build                 Channel
_tflow_select             2.1.0                 gpu                   anaconda
absl-py                   0.6.1                 py36_1000             conda-forge
arch                      4.7.0                 py36h4a00616_0        bashtage
astor                     0.7.1                 py_0                  conda-forge
blas                      1.0                   mkl
ca-certificates           2018.03.07            0                     anaconda
certifi                   2018.10.15            py36_0                anaconda
cudatoolkit               9.0                   1                     anaconda
cudnn                     7.1.4                 cuda9.0_0             anaconda
cython                    0.29.2                py36ha925a31_0
gast                      0.2.0                 py_0                  conda-forge
grpcio                    1.16.1                py36h351948d_1        anaconda
h5py                      2.8.0                 py36hf7173ca_2        anaconda
hdf5                      1.8.20                hac2f561_1            anaconda
icc_rt                    2019.0.0              h0cc432a_1
icu                       58.2                  ha66f8fd_1
intel-openmp              2019.1                144
jpeg                      9c                    hfa6e2cd_1001         conda-forge
keras-applications        1.0.6                 py36_0                anaconda
keras-preprocessing       1.0.5                 py36_0                anaconda
libopencv                 3.4.2                 h20b85fd_0            anaconda
libpng                    1.6.36                h7602738_1000         conda-forge
libprotobuf               3.6.1                 h1a1b453_1000         conda-forge
libtiff                   4.0.10                h36446d0_1001         conda-forge
libwebp                   1.0.1                 hfa6e2cd_1000         conda-forge
m2w64-gcc-libgfortran     5.3.0                 6
m2w64-gcc-libs            5.3.0                 7
m2w64-gcc-libs-core       5.3.0                 7
m2w64-gmp                 6.1.0                 2
m2w64-libwinpthread-git   5.0.0.4634.697f757    2
markdown                  2.6.11                py_0                  conda-forge
mkl                       2019.1                144
mkl_fft                   1.0.10                py36_0                conda-forge
mkl_random                1.0.2                 py36_0                conda-forge
msgpack-python            0.6.0                 py36he980bc4_1000     conda-forge
msys2-conda-epoch         20160418              1
numpy                     1.15.4                py36h19fb1c0_0
numpy-base                1.15.4                py36hc3f5095_0
opencv                    3.4.2                 py36h40b0b35_0        anaconda
openssl                   1.1.1                 he774522_0            anaconda
pandas                    0.23.4                py36h830ac7b_0
patsy                     0.5.1                 py36_0
pip                       18.1                  py36_1000             conda-forge
protobuf                  3.6.1                 py36he025d50_1001     conda-forge
py-opencv                 3.4.2                 py36hc319ecb_0        anaconda
python                    3.6.6                 he025d50_0            conda-forge
python-dateutil           2.7.5                 py36_0
python-editor             1.0.3                 py36_0                anaconda
pytz                      2018.7                py36_0
qt                        5.9.7                 vc14h73c81de_0
scipy                     1.1.0                 py36h4f6bf74_1        anaconda
setuptools                40.6.3                py36_0                conda-forge
six                       1.12.0                py36_1000             conda-forge
sqlite                    3.26.0                he774522_0
statsmodels               0.9.0                 py36h452e1ab_0
tensorboard               1.12.0                py36he025d50_0        anaconda
tensorflow                1.12.0                gpu_py36ha5f9131_0    anaconda
tensorflow-base           1.12.0                gpu_py36h6e53903_0    anaconda
tensorflow-gpu            1.12.0                h0d30ee6_0            anaconda
termcolor                 1.1.0                 py_2                  conda-forge
vc                        14.1                  h21ff451_3            anaconda
vs2015_runtime            15.5.2                3                     anaconda
werkzeug                  0.14.1                py_0                  conda-forge
wheel                     0.32.3                py36_0                conda-forge
wincertstore              0.2                   py36_1002             conda-forge
zlib                      1.2.11                h2fa13f4_1003         conda-forge

Would you recommend using this repo or the recommended repo? Thanks again!

arashno commented 5 years ago

Although the other repository is compatible with Python 3, this repository only works with Python 2.7. The import error is because you are using Python 3.6.
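For background, Python 3 removed implicit relative imports, so a bare `import common` inside architectures/alexnet.py no longer finds a sibling module in the same folder. For readers who still want to experiment under Python 3, one possible workaround is sketched below. It assumes common.py sits inside the architectures folder, and the helper `add_architectures_to_path` is hypothetical; it is not something the authors recommend or support.

```python
import os
import sys


def add_architectures_to_path(repo_root):
    """Make modules inside <repo_root>/architectures importable by bare name."""
    arch_dir = os.path.abspath(os.path.join(repo_root, "architectures"))
    if arch_dir not in sys.path:
        # Putting the folder first lets 'import common' resolve to the sibling module.
        sys.path.insert(0, arch_dir)
    return arch_dir
```

This would have to run before `import arch` at the top of eval.py. Other Python 2 constructs in the code may still need changes.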

matobler commented 5 years ago

I just spent a day figuring out how to run the pre-trained models. Here are a few things that I learned that might be useful for others:

1) I am working on Windows in Python 3.6 (also tested 3.7). Both versions work but the xrange() function in eval.py needs to be changed to range()

2) The code works with Tensorflow versions 1.8 and 1.9. It also works with 1.12, but there are a lot of warnings since the data structure has changed. I have not tested 1.10 and 1.11.

3) For Phase 2 and Phase 2 Recognition Only the common.py file needs to be copied from the architecture folder to the main folder where eval.py is, else you get a "ModuleNotFoundError: No module named 'common'" error.

4) For Phase 2 and Phase 2 Recognition Only the --depth parameter needs to be set to 152 (for the Resnet 152 model). The default value is 50.

5) For Phase 1 values in the second column of the image file (data_info.txt) need to be either 0 or 1

6) On my notebook with a Quadro M2000 with 4GB of RAM I ran out of GPU memory. The models worked fine on a GTX 1080 TI with 11GB of RAM. I tried smaller batch sizes but that did not help.
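As an alternative to editing every `xrange()` call mentioned in item 1, a small compatibility shim placed near the top of a script keeps the name working on both Python 2 and Python 3. This is a common general-purpose idiom, shown here as a sketch rather than as a change taken from the repo.

```python
try:
    xrange  # already defined on Python 2
except NameError:
    xrange = range  # Python 3: range is already lazy, so alias it

# Any later loop can keep using xrange unchanged.
total = sum(i for i in xrange(5))
```

With the shim in place, existing loops such as `for i in xrange(n):` run unmodified on either interpreter.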

While Phase 1 and Phase 2 Recognition Only work fine I still have not been able to run Phase 2. Will write another post with the errors I am getting.

It would be nice if the authors could provide a small test dataset with all the input files and commands to run each phase. Would probably save a lot of people a lot of time. That said, thanks for making the code and pre-trained models available!

matobler commented 5 years ago

For Phase 2 I created a data_info.txt file with the image name plus 9 extra columns, all set to 0:

image1.jpg,0,0,0,0,0,0,0,0,0

Without that I would get an error from the data_loader. Now I am getting the error below. Any suggestions are welcome.

Filling queue with 2000 images before starting to train. This may take some times.
Traceback (most recent call last):
  File "eval.py", line 204, in <module>
    main()
  File "eval.py", line 200, in main
    evaluate(args)
  File "eval.py", line 33, in evaluate
    top1acc= [None]*len(logits)
TypeError: object of type 'NoneType' has no len()
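
The Phase 2 data_info.txt layout described before this traceback (image name followed by nine zero-filled columns) can be generated with a short helper. The sketch below is hypothetical and only reproduces that row format; the meaning of the nine columns is not explained in this thread, and it does not address the `NoneType` error itself.

```python
def phase2_row(image_name, num_label_columns=9, delimiter=","):
    """Build one data_info line: image name followed by zero-filled label columns."""
    return delimiter.join([image_name] + ["0"] * num_label_columns)
```

For example, `phase2_row("image1.jpg")` produces the same line as the one quoted in the comment.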

fischhoff commented 5 years ago

Thanks for letting us know @arashno!

Mo-nasr commented 4 years ago

Hi @arashno @matobler, I am new to GitHub, so I was wondering if it's possible to run the pre-trained phase 2 model on Google Colab. If yes, how can I do it? Any help from anyone would be really appreciated.

r-barnes commented 2 years ago

@Mo-nasr : That might do better as a separate question/issue.

r-barnes commented 2 years ago

I agree with @matobler :

> It would be nice if the authors could provide a small test dataset with all the input files and commands to run each phase. Would probably save a lot of people a lot of time.

AlexSperka commented 2 years ago

I have been running into issues similar to those mentioned by previous commenters. I took the fork of Mo-nasr and adapted it, thanks for that!

Link to my fork

Follow the updated read-me to get it running. New features:

I will try to clean this up and convert more and more eventually. Right now, though, I am not seeing very good classification results: the only things classified correctly are elephants.