keras-team / keras-preprocessing

Utilities for working with image data, text data, and sequence data.
Other
1.02k stars 444 forks source link

To solve issue #13637 in keras repo #270

Closed seriousran closed 4 years ago

seriousran commented 4 years ago

Summary

Set function sort 'classes' To solve it, use OrderedDict

Related Issues

https://github.com/keras-team/keras/issues/13637

PR Overview

rragundez commented 4 years ago

The parameter classes indicates which classes to take into account. So even if the classes are given in a different order by the user the class_indices mapping will not change which the desired functionality given the classes parameter definition: "Optional list of strings, classes to use (e.g. ["dogs", "cats"])."

seriousran commented 4 years ago

@rragundez But, flow_from_directory and flow_from_dataframe work differently as keras-team/keras#13637

rragundez commented 4 years ago

agree, that is a different issue than "not working" though. Let me check with @Dref360 . What do you think? I believe it should be one way that classes should work. On one hand I think it should independent on the order the user gives (serving just as a filter and nothing more, this also allows to give as a set), but on the other flow_from_directory predates by far flow_from_dataframe so perhaps for backwards compatibility is better to use hte flow_from_directory behaviour.

Dref360 commented 4 years ago

I agree with you that *_directory is far more used and we should stick to its behavior.

I think this PR fixes the behavior? If it is the case, please add a test following keras-team/keras#13637 repro steps and edit the documentation accordingly.

qinst64 commented 4 years ago

@rragundez This is a bug as my exmaple. The class order shouldn't be reset silently, this leads to unknown class order of predicted results, and then unknown calcluation of confustion matrix, and precition, recall etc (i.e. which class is definded as positive?) Again with my example, if flow_from_directory (right behavior) then

flow_from_directory has the right behavior for users to set which is the positive class; flow_from_dataframe doesn't.

rragundez commented 4 years ago

@qinst64 class_indices gives the order.

@seriousran I don't really like the behaviour but as we point out is better to have a single one and the *_directory is used far more. I'll try to check the PR this weekend. Please add tests thanks!

rragundez commented 4 years ago

Please add unit tests, I cannot review nor merge this PR without them. Thanks!

seriousran commented 4 years ago

Couldn't pass the test with this error. But I can't understand why the rror occured in directory_iterator.py file. my commits didn't include that file. I will check it out again.

tests/image/directory_iterator_test.py ..*** Error in `/home/travis/miniconda/envs/test-environment/bin/python': double free or corruption (!prev): 0x0000564167c7d9c0 ***
Fatal Python error: Aborted

Thread 0x00007f253a71b740 (most recent call first):
  File "/home/travis/build/keras-team/keras-preprocessing/keras_preprocessing/image/directory_iterator.py", line 140 in __init__
  File "/home/travis/build/keras-team/keras-preprocessing/keras_preprocessing/image/image_data_generator.py", line 542 in flow_from_directory
  File "/home/travis/build/keras-team/keras-preprocessing/tests/image/directory_iterator_test.py", line 204 in test_directory_iterator_with_validation_split
...
seriousran commented 4 years ago

I gave up. In local, it works well. All pytest passed. But, the tests by travis have faild. I can't handle this problem.

However, the error can be solved by this. https://github.com/tensorflow/tensorflow/issues/8717#issuecomment-289828646

Dref360 commented 4 years ago

You need to remove dist: trusty in travis.yml

seriousran commented 4 years ago

@Dref360 thanks :) please review this! @rragundez

seriousran commented 4 years ago

@rragundez

I would appreciate it if you reconsider once again.

keras-team/keras#13637