keras-team / keras-preprocessing

Utilities for working with image data, text data, and sequence data.

Suggest to enrich the function family for image_dataset_from_directory #299

Open L-Eriksson opened 4 years ago

L-Eriksson commented 4 years ago

Hi, I'm a researcher and have been using Keras (with the TF backend) for a long time. ImageDataGenerator.flow_from_directory has been my favourite method because of my data size. However, I feel that Dataset has been recommended over the Generator approach since TF 2.x, and Keras also has the image_dataset_from_directory function in tf.keras.preprocessing. So I switched to Dataset but ran into some inconveniences. For example, I can't find equivalents of the previous test_gen.filenames and test_gen.class, which made it difficult to match my predictions to file names. I eventually managed it, but not as neatly as with built-in attributes like .filenames.

Does the team have plans to enrich these functions? It would be very helpful. Thank you!

Dref360 commented 4 years ago

Hello, we have a Keras community SIG this Friday (June 5th). I'll raise your issue there.

L-Eriksson commented 4 years ago

@Dref360 Thank you! Should I expect feedback on this issue page or somewhere else?

Dref360 commented 4 years ago

Yes, I will report back on this issue.

Dref360 commented 4 years ago

For your particular issue: class_names is still an attribute (as can be seen here).

You can get all the information from dataset_utils.index_directory.

I know this is not ideal.
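To illustrate the kind of information such an indexing step provides, here is a minimal pure-Python sketch. It is not the Keras dataset_utils.index_directory implementation; index_image_directory, IMAGE_EXTENSIONS, and the demo folder names are hypothetical names made up for this example. It assumes one subfolder per class and returns file paths, integer labels, and class names in a deterministic order, which is what is needed to line predictions up with file names.

```python
import os
import tempfile

# Hypothetical allow-list of image extensions, chosen for this sketch only.
IMAGE_EXTENSIONS = (".bmp", ".gif", ".jpeg", ".jpg", ".png")


def index_image_directory(directory, extensions=IMAGE_EXTENSIONS):
    """Index a class-per-subfolder tree.

    Returns (file_paths, labels, class_names), where labels[i] is the index
    into class_names for file_paths[i]. This is a stand-in sketch, not the
    Keras function, and its ordering is not guaranteed to match Keras.
    """
    # Class names are the immediate subfolders, sorted alphabetically.
    class_names = sorted(
        entry
        for entry in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, entry))
    )
    file_paths, labels = [], []
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(directory, class_name)
        # Walk nested folders in sorted order so the result is deterministic.
        for root, _, files in sorted(os.walk(class_dir)):
            for fname in sorted(files):
                if fname.lower().endswith(extensions):
                    file_paths.append(os.path.join(root, fname))
                    labels.append(label)
    return file_paths, labels, class_names


if __name__ == "__main__":
    # Build a tiny throwaway tree with empty files to show the output shape.
    with tempfile.TemporaryDirectory() as tmp:
        for cls, names in {"cats": ["c1.png"], "dogs": ["d1.jpg", "d2.jpg"]}.items():
            os.makedirs(os.path.join(tmp, cls))
            for name in names:
                open(os.path.join(tmp, cls, name), "w").close()
        paths, labels, classes = index_image_directory(tmp)
        for path, label in zip(paths, labels):
            print(os.path.basename(path), classes[label])
```

Pairing such a list with predictions only makes sense if the dataset yields samples in a fixed order, so when building the dataset one would typically pass shuffle=False to image_dataset_from_directory and check that the order matches before relying on it.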

Dref360 commented 4 years ago

From the Keras SIG: With the Keras preprocessing layers coming out of experimental, there will be a lot of documentation on how to transition from the old API to the new one.

In addition, I submitted an issue on your behalf on the TensorFlow GitHub. I can make the change pretty easily, but I want to be sure that the design will be accepted. https://github.com/tensorflow/tensorflow/issues/40203

L-Eriksson commented 4 years ago

Thanks for your update. For the particular solutions: I have tried them and they work. As you also stated, it's not ideal, but it could be one way to solve the issue.

For your submission on the TF GitHub: It's exactly what I meant, and I have also commented under that issue. I will wait for further information.