AlexOlsen / DeepWeeds

A Multiclass Weed Species Image Dataset for Deep Learning
https://www.nature.com/articles/s41598-018-38343-3
Apache License 2.0

Can u Help Me Type Error #2

Open AhmetBAGBARS opened 5 years ago

AhmetBAGBARS commented 5 years ago

Good luck with your work. I ran into a problem while training the model according to your instructions: I get a TypeError in "train_data_generator" in "deepweeds.py". How can I solve this error? I would be glad for any help. Thank you, and good work.

qiqi17 commented 5 years ago

Were you able to solve this error? I get the same error. I would be glad for any help. Thank you, and good work.

linaashaji commented 5 years ago

Hello, did you get any solution for that?

AlexOlsen commented 5 years ago

Hi all. Sorry for the (very) late response. Unfortunately, I am unable to replicate this error. My first suggestion is to make sure you have identical versions of Keras and Pandas installed, since successful use of the "flow_from_dataframe" function may depend on them.
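One way to compare environments is to print the installed versions and post them in the thread. This is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper, not part of the DeepWeeds repository, and the package names listed are assumptions about what is relevant here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumed distribution names; adjust them to match your environment.
    packages = ["keras", "keras-preprocessing", "pandas", "tensorflow"]
    for name, ver in installed_versions(packages).items():
        print(f"{name}: {ver or 'not installed'}")
```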

NegarTavakoli commented 5 years ago

Hi Alex, thanks for sharing your work. I have created a new environment with identical versions, but I still get the same error. Do you have any other suggestions for us?

Attached you can find the error and the versions of Keras and Pandas :)

[Attached screenshots: Capture1, Capture2]

Arne-van-Au commented 5 years ago

Hey guys, I had the same problem, and finding a workaround was surprisingly tricky. I couldn't find the actual cause of the error message. Something must have changed in newer versions of keras_preprocessing, and thus in dataframe_iterator.py, but I can't say what.

The following changes work for me. Alongside the existing CLASSES list, I added a second list, CLASSES_str, containing the class indices as strings:

CLASSES_str = ['0', '1', '2', '3', '4', '5', '6', '7', '8']

Then I read the dataframes for train, val and test with dtype=str:

train_dataframe = pd.read_csv(train_label_file, dtype=str)
val_dataframe = pd.read_csv(val_label_file, dtype=str)
test_dataframe = pd.read_csv(test_label_file, dtype=str)

This prevents the error message: "TypeError: If class_mode="categorical", y_col="label" column values must be type string, list or tuple." but leads to a new error: "Found 0 validated image filenames belonging to 9 classes."
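The effect of dtype=str can be checked on a tiny in-memory CSV. This is a minimal sketch, assuming the label files have "Filename" and "Label" columns as in the calls in this thread; the sample rows are made up.

```python
import io

import pandas as pd

# Made-up sample in the assumed layout of the label files.
SAMPLE_CSV = "Filename,Label\na.jpg,0\nb.jpg,8\n"

as_int = pd.read_csv(io.StringIO(SAMPLE_CSV))
as_str = pd.read_csv(io.StringIO(SAMPLE_CSV), dtype=str)

# Without dtype=str the labels are parsed as integers, which is what the
# class_mode="categorical" type check complains about.
print(type(as_int["Label"].iloc[0]))

# With dtype=str every value, including the label, is a Python string.
print(type(as_str["Label"].iloc[0]))
```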

I don't know why. All the tutorials I could find suggested that passing "train_dataframe" this way was correct, and some recommended using absolute filenames. So I set IMG_DIRECTORY to an absolute path, e.g.

IMG_DIRECTORY = "D:/DeepWeeds/images/"

and then

train_dataframe['Filenamefull'] = IMG_DIRECTORY + train_dataframe['Filename']
val_dataframe['Filenamefull'] = IMG_DIRECTORY + val_dataframe['Filename']
test_dataframe['Filenamefull'] = IMG_DIRECTORY + test_dataframe['Filename']

And finally I called train_data_generator.flow_from_dataframe as follows:

train_data_generator = train_data_generator.flow_from_dataframe(
    train_dataframe,
    IMG_DIRECTORY,
    x_col='Filenamefull',
    y_col='Label',
    target_size=RAW_IMG_SIZE,
    batch_size=BATCH_SIZE,
    has_ext=True,
    classes=CLASSES_str,
    class_mode='categorical')

Here I put the predefined list of strings into classes=CLASSES_str. That works.
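A "Found 0 validated image filenames" message typically means that none of the paths built from the directory and the filename column point to an existing image file. Here is a minimal standard-library sketch for checking this before building the generator. `missing_files` is a hypothetical helper, and the "Filename" column name is taken from the snippets above.

```python
import csv
import os


def missing_files(label_file, img_directory, filename_col="Filename"):
    """Return the filenames listed in label_file that are not in img_directory."""
    missing = []
    with open(label_file, newline="") as handle:
        for row in csv.DictReader(handle):
            path = os.path.join(img_directory, row[filename_col])
            if not os.path.isfile(path):
                missing.append(row[filename_col])
    return missing
```

If this returns every filename in the label file, the image directory is most likely wrong or does not contain the images yet.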

AlexOlsen commented 5 years ago

Thanks @Arne-van-Au. Does this fix work for you, @NegarTavakoli?

Arne-van-Au commented 5 years ago

Thanks Alex for pushing. I hope my raw kind of coding is not too terrible. :-)

galadash commented 3 years ago

I ran into the same issue. I cannot find the Keras changelog entry showing when this behaviour changed, but it indeed no longer accepts the data in the form presented in this repo:

if class_mode is "categorical" (default value) it must include the y_col column with the class/es of each image. Values in column can be string/list/tuple if a single class

source: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator#flow_from_dataframe

What I did is similar to what Arne proposes; however, I decided to use the full category names instead of stringified indices:

train_dataframe = pd.read_csv(train_label_file)
val_dataframe = pd.read_csv(val_label_file)
test_dataframe = pd.read_csv(test_label_file)
# Replace each integer label with its class name (a string)
train_dataframe["Label"] = train_dataframe["Label"].apply(lambda x: CLASS_NAMES[x])
val_dataframe["Label"] = val_dataframe["Label"].apply(lambda x: CLASS_NAMES[x])
test_dataframe["Label"] = test_dataframe["Label"].apply(lambda x: CLASS_NAMES[x])

Later, when calling flow_from_dataframe(), you need to change the classes parameter to CLASS_NAMES, and you need to remove the has_ext parameter, because it has been deprecated:

test_data_generator = test_data_generator.flow_from_dataframe(
    test_dataframe,
    IMG_DIRECTORY,
    x_col="Filename",
    y_col="Label",
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    shuffle=False,
    classes=CLASS_NAMES,
    class_mode='categorical')

Note that you need to make this change three times, once for each dataframe.

Finally, if needed, train_data_generator.class_indices gives you the dictionary that maps each label name to its index. Take care that the order of the labels has changed, as they are sorted alphabetically. If the order is important, I suggest sticking with Arne's solution above.
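class_indices maps each label name to its index, so an index-to-name lookup is obtained by inverting it. This is a minimal sketch without Keras. The class names are assumed to match the repository's CLASS_NAMES list, and `class_indices` is built by hand as a stand-in for the generator attribute, under the assumption described above that the indices follow alphabetical order.

```python
# Assumed to match the repository's CLASS_NAMES (original label order 0-8).
CLASS_NAMES = [
    "Chinee apple", "Lantana", "Parkinsonia", "Parthenium", "Prickly acacia",
    "Rubber vine", "Siam weed", "Snake weed", "Negative",
]

# Stand-in for generator.class_indices, assuming alphabetical ordering.
class_indices = {name: i for i, name in enumerate(sorted(CLASS_NAMES))}

# Invert it to translate a predicted index back to a label name.
index_to_name = {i: name for name, i in class_indices.items()}

# Map each generator index back to the original DeepWeeds label number.
to_original_label = {i: CLASS_NAMES.index(n) for i, n in index_to_name.items()}

print(index_to_name[2])      # Negative
print(to_original_label[2])  # 8
```

Under these assumptions "Negative" moves from label 8 to index 2, which is why predictions need to be translated back before they are compared with the original label files.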

cq2019git commented 3 years ago

Unzipping these zip files is the way to solve the problem.