algorithmiaio / sample-apps

Sample apps and sites with Algorithmia integration

Which paths should be set for the code in "Deep Dive into Object Detection with Open Images, using Tensorflow"? #9

Open MatsumotoHiroko opened 6 years ago

MatsumotoHiroko commented 6 years ago

Hello everyone, I'm new to TensorFlow (and machine learning).

I'd like to do object detection with Open Images, and I am trying to follow your article titled "Deep Dive into Object Detection with Open Images, using Tensorflow": https://blog.algorithmia.com/deep-dive-into-object-detection-with-open-images-using-tensorflow/ I have progressed as far as "Step1: Image Downloading".

Right now I am stuck because I don't understand how to set the parameters. Which file should each path point to? In particular, what should "datapoints_input_path" in "process_images.py" be? https://github.com/algorithmiaio/sample-apps/blob/master/deep_dive_demos/open_images_detection/preprocessing/process_images.py#L49 After "process_images.py" finished, the output file at "datapoints_output_path" was empty. I may well have set a parameter incorrectly in the first place.

Could you tell me which file each of these scripts expects?

For reference, these are the values I passed to each script.

translate_class_descriptions.py

trainable_classes_path=/tmp/data/classes/classes-trainable.txt
class_description_path=/tmp/data/classes/class-descriptions.csv
trainable_translated_path=/tmp/output/trainable_translated.csv

process_metadata.py

annotations_input_path=/tmp/data/bbox_annotations/train/annotations-human-bbox.csv
image_index_input_path=/tmp/data/images/train/images.csv
point_output_path=/tmp/output/point_output.csv
image_index_output_path=/tmp/output/image_index_output.csv
trainable_classes_path=/tmp/data/classes/classes-trainable.txt

download_images.py

images_path=/tmp/output/image_index_output.csv
images_output_directory=/tmp/download_images

process_images.py

image_directory=/tmp/download_images
image_saving_directory=/tmp/save_images
datapoints_input_path=/tmp/output/point_output.csv (I set this to the output of "process_metadata.py".)
datapoints_output_path=/tmp/output/point_output_process.csv

I'm looking forward to your advice 🙇‍♀️ Thank you.

MatsumotoHiroko commented 6 years ago

Here is some additional information. The error occurs on this line: https://github.com/algorithmiaio/sample-apps/blob/master/deep_dive_demos/open_images_detection/preprocessing/process_images.py#L34

checking if images are valid from label index:   0%|                                                                      | 0/315 [00:00<?, ?it/s]stored_path::/tmp/download_images/0000626c3e317247.jpg
image_opend
open_im::::
im.verify()
verify_im:::::<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=4452x3144 at 0x7FF843DA9B00>

Traceback (most recent call last):
  File "/anaconda/envs/py35/lib/python3.5/site-packages/PIL/ImageFile.py", line 139, in load
    read = self.load_read
AttributeError: 'JpegImageFile' object has no attribute 'load_read'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "process_images.py", line 68, in <module>
    filtered_points = process_images(images_directory, resized_directory, points)
  File "process_images.py", line 39, in process_images
    im.thumbnail((256, 256))
  File "/anaconda/envs/py35/lib/python3.5/site-packages/PIL/Image.py", line 1812, in thumbnail
    im = self.resize(size, resample)
  File "/anaconda/envs/py35/lib/python3.5/site-packages/PIL/Image.py", line 1543, in resize
    self.load()
  File "/anaconda/envs/py35/lib/python3.5/site-packages/PIL/ImageFile.py", line 143, in load
    read = self.fp.read
AttributeError: 'NoneType' object has no attribute 'read'

MatsumotoHiroko commented 6 years ago

I was able to resolve this problem myself in "process_images.py". The image must be reopened after calling im.verify(), because the image object cannot be used once it has been verified. http://pillow.readthedocs.io/en/4.3.x/reference/Image.html?highlight=verify#PIL.Image.Image.verify
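
A minimal sketch of the verify-then-reopen pattern, assuming Pillow is installed. The helper name load_verified_thumbnail and its size parameter are hypothetical and are not taken from process_images.py; only the 256x256 thumbnail size comes from the traceback above.

```python
from PIL import Image


def load_verified_thumbnail(path, size=(256, 256)):
    """Verify an image file, then reopen it before resizing.

    Hypothetical helper for illustration; not the code in process_images.py.
    """
    # verify() checks the file for corruption, but the image object
    # should not be used for pixel operations afterwards.
    with Image.open(path) as probe:
        probe.verify()

    # Open a fresh handle on the same file for the actual resize.
    with Image.open(path) as im:
        im.thumbnail(size)  # resizes in place, keeping the aspect ratio
        return im.copy()    # detach the result from the file handle
```

Calling thumbnail() on the object that was just verified is what led to the "'NoneType' object has no attribute 'read'" error shown earlier; opening the file a second time avoids it.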

In addition, the format argument passed to im.save must be "JPEG" rather than "JPG". http://pillow.readthedocs.io/en/4.3.x/handbook/image-file-formats.html#fully-supported-formats
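
As a rough illustration, assuming Pillow is installed, saving with the registered format name could look like the following. The helper name save_as_jpeg is hypothetical, and the convert("RGB") step is my own addition for images that carry an alpha channel; it is not part of the original script.

```python
from PIL import Image


def save_as_jpeg(im, out_path):
    """Save an image as JPEG using Pillow's registered format name.

    Hypothetical helper for illustration; not the code in process_images.py.
    """
    # Pillow's format name is "JPEG"; "JPG" is only a file extension
    # and is not accepted as a value for the format argument.
    # convert("RGB") is an added safeguard, since JPEG has no alpha channel.
    im.convert("RGB").save(out_path, format="JPEG")
```

The output file can still be named with a .jpg extension; only the format argument needs the full name.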

Furthermore, I made a mistake of my own. For the "datapoints_input_path" parameter, I should have set "/tmp/data/output/image_index_output.csv" instead of "/tmp/data/output/point_output.csv".

After that, I might send a pull request.