apple / turicreate

Turi Create simplifies the development of custom machine learning models.
BSD 3-Clause "New" or "Revised" License
11.2k stars 1.14k forks

Getting error while extracting features from imgs #3325

Open duchaomin opened 4 years ago

duchaomin commented 4 years ago

I have some JPG images that I want to use for image similarity. Here are the steps:

1. sf = tc.image_analysis.load_images(img_path)
2. sf['deep_features'] = tc.image_analysis.get_deep_features(sf['image'], model_name="resnet-50")

I am getting the error below in step 2:

        sf['deep_features'] = tc.image_analysis.get_deep_features(sf['image'], model_name="resnet-50")
      File "/home/xx/anaconda3/envs/tc/lib/python3.7/site-packages/turicreate/toolkits/image_analysis/image_analysis.py", line 257, in get_deep_features
        batch_size=batch_size)
      File "/home/xx/anaconda3/envs/tc/lib/python3.7/site-packages/turicreate/toolkits/_image_feature_extractor.py", line 180, in extract_features
        images_in_numpy = next_batch()
      File "/home/xx/anaconda3/envs/tc/lib/python3.7/site-packages/turicreate/toolkits/_image_feature_extractor.py", line 140, in next_batch
        end_index,
      File "/home/xx/anaconda3/envs/tc/lib/python3.7/site-packages/turicreate/extensions.py", line 181, in <lambda>
        return lambda *args, **kwargs: _run_toolkit_function(fn, arguments, args, kwargs)
      File "/home/xx/anaconda3/envs/tc/lib/python3.7/site-packages/turicreate/extensions.py", line 169, in _run_toolkit_function
        raise _ToolkitError(ret[1])
    turicreate.toolkits._main.ToolkitError: Unexpected JPEG decode failure

For step 1, I found this:

    turicreate.load_images(url, format='auto', with_path=True, recursive=True, ignore_failure=True, random_order=False)

    ignore_failure : bool, optional
        If true, prints warning for failed images and keep loading the rest of the images.

I tested two images: one is 0 KB and the other is an invalid JPG file with HTML code in it. Step 1 correctly gave the warnings below:

    Unexpected JPEG decode failure file: /home/xx/img_test/error.jpg
    Unexpected JPEG decode failure file: /home/xx/img_test/noinfo.jpg

and step 2 raised no error.

So why does step 2 still hit a decode failure? How can I pre-process this data, or skip the error and continue with the rest of the images?

duchaomin commented 4 years ago

Well, maybe I have found the problematic image: https://img.alicdn.com/imgextra/i4/1785908005/TB1fNLJo4TI8KJjSspiXXbM4FXa_!!0-item_pic.jpg It doesn't look like a complete picture. Perhaps you could catch such errors and skip the image? How can I do that?

TobyRoseman commented 4 years ago

Thanks for the detailed bug report. I can reproduce this issue with the linked image.

This is a bug. If load_images is called with ignore_failure=False, then calling get_deep_features with those images should not produce a JPEG decode failure.

duchaomin commented 4 years ago

ignore_failure=False? Did you mean to say =True?
Well, when will you fix the bug? I'm waiting... Also, FYI: https://stackoverflow.com/questions/42462431/oserror-broken-data-stream-when-reading-image-file

TobyRoseman commented 4 years ago

I meant ignore_failure=False. Not ignoring failures means that you error out when a failure occurs. If we're not ignoring errors, it would be best to error out early.

Unfortunately, the Stack Overflow question is not relevant. Although Turi Create does use PIL/Pillow for some functionality, it's not used in any of the code paths here.

duchaomin commented 4 years ago

OK, I think I see what you mean. Well, when will you fix the bug? Or, for now, what can I do to ignore such errors and continue with step 2 instead of exiting?

TobyRoseman commented 4 years ago

I don't have an ETA for the fix. As a workaround, I would just remove the problematic image.
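
For anyone who needs to find such images in a large folder before removing them, here is a minimal sketch of a pre-check. It uses only the Python standard library and is not part of Turi Create. It assumes that a complete JPEG file starts with the start-of-image marker (FF D8) and ends with the end-of-image marker (FF D9), which is only a heuristic: it will flag empty files, HTML saved with a .jpg extension, and files cut off mid-download, but it may miss other kinds of corruption and may flag valid files that carry trailing bytes. The helper names `looks_like_complete_jpeg` and `split_jpegs` are hypothetical.

```python
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def looks_like_complete_jpeg(path):
    """Heuristic check: the file starts with SOI and ends with EOI.

    This does not decode the image, so it cannot prove the file is valid.
    """
    data = Path(path).read_bytes()
    if len(data) < 4:
        return False
    return data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI)


def split_jpegs(folder):
    """Return (good, suspect) lists of .jpg/.jpeg paths found under folder."""
    good, suspect = [], []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        if looks_like_complete_jpeg(path):
            good.append(path)
        else:
            suspect.append(path)
    return good, suspect
```

Calling `split_jpegs(img_path)` on the image folder from step 1 gives a list of suspect files to inspect and move elsewhere before running load_images and get_deep_features again. Whether this catches every file that Turi Create's decoder rejects is not guaranteed.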