IyatomiLab / LeafGAN


dataset #9

Open PINK512 opened 1 year ago

PINK512 commented 1 year ago

Excuse me, I am a beginner at this and cannot run the code successfully. Could you please answer my questions?

1. Where can I download the dataset used in your paper?

2. In what order should I run the code?

I will continue searching the Internet for answers in the meantime. Thank you in advance.

huuquan1994 commented 1 year ago

Hello @PINK512, sorry for my late reply! Here are the answers to your two questions:

  1. The cucumber dataset that we used belongs to a national project so we can't share it. Sorry, but you'll need to work with your own dataset at the moment. Please note that if the dataset is not cucumber, you'll need to re-train the LFLSeg module.
  2. Here is the order: train (or download) the LFLSeg module first, generate the leaf masks with prepare_mask.py, and then train LeafGAN with train.py (testing is done with test.py).

Feel free to ask me anything!

PINK512 commented 1 year ago

Hi @huuquan1994, thank you for the answer. I still have some questions:

1. After I trained the LFLSeg module, should I run train.py directly?
2. The pictures in the trainA and trainB folders are the healthy and diseased images, right?
3. Is it correct to set up the dataset folders as follows: H2B: trainA, trainA_mask, testA, trainB, trainB_mask, testB?
4. Are the images generated by prepare_mask.py the same as the output files generated by the LFLSeg module? I don't know which one is the pretrain path.

Thank you for your consideration.

huuquan1994 commented 1 year ago

@PINK512 Let me answer your questions one by one

  1. After I trained the LFLSeg module, should I run train.py directly?

Yes, if you've trained the LFLSeg module and prepared all the training data, you can start training with train.py.

  2. The pictures in the trainA and trainB folders are the healthy and diseased images, right?

In our paper, trainA and trainB contain healthy and diseased images, respectively. But note that LeafGAN is based on CycleGAN, so you can train your model to translate between arbitrary domains. For example, depending on your purpose, you can add diseased images to the trainA folder if you want.

  3. Is it correct to set up the dataset folders as follows: H2B: trainA, trainA_mask, testA, trainB, trainB_mask, testB?

Yes, this is correct. Note that if you train your model with pre-computed mask images (trainA_mask, trainB_mask), you don't need to load the LFLSeg module. Please refer to the README for more details.
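For reference, the full layout would look like this (using the H2B dataset root from your question):

```
H2B/
├── trainA/        # healthy training images
├── trainA_mask/   # leaf masks for trainA
├── trainB/        # diseased training images
├── trainB_mask/   # leaf masks for trainB
├── testA/         # healthy test images
└── testB/         # diseased test images
```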

  4. Are the images generated by prepare_mask.py the same as the output files generated by the LFLSeg module?

Yes, they are the same. To save time during training, it's recommended to segment the training leaf data with the LFLSeg module beforehand. The --pretrain_path option is the path to the trained LFLSeg model. If you're working with cucumber data, you can download the pre-trained model from the LFLSeg section of the README.
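Conceptually, this offline step just runs the trained segmentation model over each training folder once and saves the binarized masks. A minimal sketch, assuming a model that outputs a single-channel heatmap (the interface and the threshold value here are assumptions, not the repo's exact code; prepare_mask.py is the real script):

```python
import os

import torch
from PIL import Image
import torchvision.transforms.functional as TF


def save_masks(seg_model, src_dir, dst_dir, threshold=0.35):
    """Segment every image in src_dir once and save the masks to dst_dir,
    so training doesn't have to re-run LFLSeg on each image every epoch."""
    os.makedirs(dst_dir, exist_ok=True)
    seg_model.eval()
    for name in os.listdir(src_dir):
        img = TF.to_tensor(Image.open(os.path.join(src_dir, name)).convert("RGB"))
        with torch.no_grad():
            heatmap = seg_model(img.unsqueeze(0))[0]  # assumed (1, H, W) output in [0, 1]
        mask = (heatmap > threshold).float()          # binarize (threshold is an assumption)
        TF.to_pil_image((mask * 255).byte()).save(os.path.join(dst_dir, name))
```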

PINK512 commented 1 year ago

Hi @huuquan1994, I'm using a tomato dataset. I trained with train.py, but it throws the following error: [image: error log]. I can get rec_A, but I can't get rec_B. This is my dataset folder: [image: folder structure]. I put the healthy and diseased images into trainA and trainB, respectively. I don't know what's wrong with it. Thanks in advance.

huuquan1994 commented 1 year ago

@PINK512 Correct me if I'm wrong, but it seems to me that the code in leaf_gan_model.py has been changed, hasn't it? For LeafGAN (and other CycleGAN-based methods), rec_A and rec_B are created when you call the forward function. The networks are then trained by calling the backward_G and backward_D functions. Please refer to the original CycleGAN paper for more details!

From your error logs, please check the forward function in leaf_gan_model.py.
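To make the data flow concrete, here is a minimal sketch of a CycleGAN-style forward pass with stand-in generators (illustrative only; the real logic lives in leaf_gan_model.py):

```python
import torch
import torch.nn as nn

# Stand-ins for the two generators (the real ones are deep ResNet-based networks).
G_A = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # translates domain A -> B
G_B = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # translates domain B -> A

real_A = torch.randn(1, 3, 256, 256)
real_B = torch.randn(1, 3, 256, 256)

# forward(): both reconstructions are created here.
fake_B = G_A(real_A)  # G_A(A)
rec_A = G_B(fake_B)   # G_B(G_A(A)), should reconstruct real_A
fake_A = G_B(real_B)  # G_B(B)
rec_B = G_A(fake_A)   # G_A(G_B(B)), should reconstruct real_B

# backward_G() / backward_D() then compute the losses from these tensors,
# so a modified forward() that skips rec_B will break the loss computation.
```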

PINK512 commented 1 year ago

@huuquan1994 I have run the whole pipeline: I trained with the unmasked pictures first, and then with the masked ones. When I ran test.py on new unmasked pictures, the result was as follows: [image]. Is there anything wrong here? Thank you in advance.

huuquan1994 commented 1 year ago

@PINK512 Thanks for your question! Looking at your result, I can't really tell what is wrong here. There are many factors to consider: for example, how big is your dataset, and how many epochs did you train the model for?

At first glance, it seems to me that your model has not had enough training (but again, I'm not 100% sure).

I'd also advise you to check and try different hyperparameters!

PINK512 commented 1 year ago

Hi @huuquan1994, I have a question about training. I trained for 200 epochs with 1,000 diseased and 1,000 healthy pictures. When I used unmasked pictures, the code could generate the masks and learn the features of the leaf, and I could get a fake diseased picture from a healthy one. [images]

However, when I used masked pictures for training, the results were bad; it looks like the model cannot learn the features. [images]

Therefore, I'm confused about the difference between the two training modes.

huuquan1994 commented 1 year ago

@PINK512

The mask images are supposed to be inputs to the discriminators only. The generators generate full images (without masking). This way, through the GAN loss, the generators are forced to generate symptoms only in the masked area.

I see that the 3rd and 4th images above are masked, which is not the right output for LeafGAN. The masked images are used during training together with the normal leaf images. To train with masked images, make sure to include --dataset_mode unaligned_masked in the command line.

Please also refer to our paper for more details!
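A minimal sketch of this mechanism with stand-in networks (illustrative only; the actual implementation is in leaf_gan_model.py):

```python
import torch
import torch.nn as nn

G = nn.Conv2d(3, 3, kernel_size=3, padding=1)            # stand-in generator
D = nn.Conv2d(3, 1, kernel_size=4, stride=2, padding=1)  # stand-in PatchGAN discriminator

real_A = torch.randn(1, 3, 256, 256)  # full (unmasked) healthy leaf image
mask = torch.rand(1, 1, 256, 256)     # leaf mask (from LFLSeg or the *_mask folder)

fake_B = G(real_A)             # the generator always sees and outputs full images
masked_fake_B = fake_B * mask  # masking is applied only at the discriminator input
pred_fake = D(masked_fake_B)   # the GAN loss on this prediction "sees" only the
                               # leaf area, so symptoms are pushed inside the mask
```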

PINK512 commented 1 year ago

I used the correct flag (--dataset_mode unaligned_masked). The 3rd image is the input and the 4th is the output of the generator. The generated leaf does not show the lesions.

huuquan1994 commented 1 year ago

@PINK512

Masked images are the input of the discriminator D only, not the input of the generator G. If the images in your trainA and trainB folders (not in trainA_mask and trainB_mask) are full leaf images, the input of the generator must be full leaf images (not masked images). Please refer to Fig. 2b in our paper!

You mentioned that the 3rd image is the input (the masked version), which I assume might look different from the full leaf images in your training data!? (Note that this is just my assumption, since the generated results also depend on many factors.)

PINK512 commented 1 year ago

@huuquan1994 Hi! So far, leaves with disease spots can be generated, but the boundary between the generated disease region and the original leaf region is very sharp. Could you please tell me which parameters I should tune to make the pictures more realistic? [image]

huuquan1994 commented 1 year ago

@PINK512 Sorry for my late response! I think there are two main reasons for this problem:

  1. The leaf segmentation is not accurate enough to cover the leaf area.
  2. The cycle-consistency loss term isn't tuned correctly.

I think you could try increasing the coefficients of the cycle-consistency loss (i.e., --lambda_A and --lambda_B), then check whether that reduces the problem.
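For reference, this is how those coefficients enter the generator objective in CycleGAN-style code (a minimal sketch; the variable names are illustrative, and 10.0 is CycleGAN's default for both):

```python
import torch

criterion_cycle = torch.nn.L1Loss()
lambda_A, lambda_B = 10.0, 10.0  # CycleGAN defaults; try raising them

real_A = torch.randn(1, 3, 256, 256); rec_A = torch.randn(1, 3, 256, 256)
real_B = torch.randn(1, 3, 256, 256); rec_B = torch.randn(1, 3, 256, 256)

# Larger lambdas penalize reconstruction error more heavily, which keeps the
# translated image closer to the source leaf and can soften abrupt boundaries.
loss_cycle_A = criterion_cycle(rec_A, real_A) * lambda_A
loss_cycle_B = criterion_cycle(rec_B, real_B) * lambda_B
loss_G = loss_cycle_A + loss_cycle_B  # plus the adversarial (and identity) terms
```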

PINK512 commented 1 year ago

Excuse me, I have a question about the batch size. I tried other batch sizes, but it throws the error below: [image]. I don't know how to modify the code. Could you please help me with it?

huuquan1994 commented 8 months ago

@PINK512 Sorry for the late response! By default, CycleGAN/LeafGAN uses a batch size of 1. I haven't tried writing code to train with a batch size > 1, but I think it's possible if you modify the image pooling (image buffer) mechanism in CycleGAN (see the CycleGAN paper, Section 4, Implementation - Training details).
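A minimal sketch of such an image pool that supports larger batches by processing each image individually (illustrative only; CycleGAN implements this in its ImagePool class, and other parts of the code may also assume a batch size of 1):

```python
import random

import torch


class ImagePool:
    """Buffer of previously generated images, so the discriminator is updated
    with a history of fakes rather than only the latest ones (CycleGAN, Sec. 4)."""

    def __init__(self, pool_size=50):
        self.pool_size = pool_size
        self.images = []

    def query(self, images):
        out = []
        for img in images:  # handle each image in the batch separately
            img = img.unsqueeze(0)
            if len(self.images) < self.pool_size:
                self.images.append(img)  # fill the buffer first
                out.append(img)
            elif random.random() > 0.5:
                idx = random.randrange(self.pool_size)
                out.append(self.images[idx].clone())  # return an older fake...
                self.images[idx] = img                # ...and store the new one
            else:
                out.append(img)  # otherwise pass the new fake straight through
        return torch.cat(out, 0)


fake_B = ImagePool().query(torch.randn(4, 3, 256, 256))  # works with batch size > 1
```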