Nevermind, I had somehow missed the info at the very bottom to download your actual dataset. It's starting to make sense now. I am putting together a new dataset to try to enhance the model even further. Would you be interested in working with me to train it?
@SkyTNT would you be able to explain a bit more about how the dataset works? I have finally gotten around to downloading the full thing and exploring the code.
My understanding is that the dataset is generated on the fly by compositing the BG and FG images with some augmentations. Really quite clever! But, if that is the case, then what is the purpose of the imgs & masks folders? Those masks are of extremely poor quality, so I'm not sure what they are supposed to be for.
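To check my understanding, the core of it would be something like this (an illustrative sketch with my own names, not the repo's actual code): the foreground's alpha channel is composited over a random background, and that same alpha becomes the ground-truth mask.

```python
# Illustrative sketch of the compositing idea, not the repo's actual code:
# paste a foreground (with alpha) onto a random background; the alpha
# channel itself becomes the ground-truth segmentation mask.
import numpy as np

def composite(fg_rgba: np.ndarray, bg_rgb: np.ndarray):
    """fg_rgba: HxWx4 floats in [0, 1]; bg_rgb: HxWx3 floats in [0, 1]."""
    alpha = fg_rgba[..., 3:4]
    image = fg_rgba[..., :3] * alpha + bg_rgb * (1.0 - alpha)
    mask = alpha[..., 0]  # the ground truth comes for free from the alpha
    return image, mask
```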
The imgs & masks folders contain manually labeled images. When I tried training with only the generated dataset, the accuracy of the model dropped; I suspect the generated data deviates from real images, so I kept the labeled ones. In my training, the convert-to-sketch data augmentation was very helpful for improving accuracy, but my convert-to-sketch implementation still needs work to get closer to real sketches. I also found that compositing and augmenting at low resolution can make image detail look unnatural, while doing it at high resolution is slow. If your CPU is powerful enough, you can increase the resolution of DatasetGenerator and add RescalePad(image_size) to transform_generator.
https://github.com/SkyTNT/anime-segmentation/blob/c98f338ce402a1e7bfccb93b66c9626c741b28c5/data_loader.py#L272-L273
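Roughly like this, as a sketch (assuming transform_generator is a torchvision-style Compose; RescalePad is the transform defined in data_loader.py, and image_size is whatever higher resolution your CPU can handle):

```python
# Sketch of the suggested change in data_loader.py; adapt to the actual code.
from torchvision import transforms

image_size = 1024  # composite/augment at a higher resolution if your CPU allows

transform_generator = transforms.Compose([
    # ... keep the existing generator augmentations here ...
    RescalePad(image_size),  # RescalePad is the transform from data_loader.py
])
```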
Thanks, that is helpful to know. I will experiment with the augmentations, and I think I can put together an alternate dataset of the manually labeled images.
I have been able to run the training code, but I am running into a problem with export.py:

RuntimeError: Error(s) in loading state_dict for AnimeSegmentation
Do you know what might cause this?
Can you provide more detailed information?
Oh, it was my mistake. I didn't realize the training has two steps (the GT encoder first, then the main training), so I had not trained long enough to be able to export.
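In case anyone else hits the same state_dict error: printing the checkpoint's keys makes a mismatch obvious. A quick diagnostic sketch (the checkpoint path is a placeholder):

```python
# Inspect which weights a Lightning checkpoint actually holds.
import torch

ckpt = torch.load("path/to/last.ckpt", map_location="cpu")  # placeholder path
state = ckpt.get("state_dict", ckpt)  # Lightning nests weights under "state_dict"
for key in list(state)[:10]:
    print(key, tuple(state[key].shape))
```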
After updating PyTorch to the CUDA 11.7 build, I now get an error when trying to train:

File "D:\Downloads\anime-segmentation\train.py", line 113, in training_step
    loss0, loss = self.net.compute_loss(loss_args)
TypeError: ISNetGTEncoder.compute_loss() missing 1 required positional argument: 'targets'

Is this a bug in the CUDA 11.7 build of PyTorch? Do I need to downgrade?
Did you modify ISNetGTEncoder.compute_loss? https://github.com/SkyTNT/anime-segmentation/blob/2373527d745755fbe2987ba146d9326fed8e8881/model/isnet.py#L431-L434
Thank you, the file had changed somehow; after downloading it again, it works. I am not sure what happened, but maybe I accidentally copied the file from the original DIS repo. I had been looking at the code all day, so I probably made a careless mistake.
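For the record, here is the mismatch in miniature (illustrative signatures only; see isnet.py for the real code):

```python
# Illustrative only: two toy classes showing the signature mismatch.
class RepoStyle:
    # train.py passes one packed argument, so the method unpacks it itself:
    def compute_loss(self, args):
        preds, targets = args  # hypothetical unpacking; see isnet.py
        return preds, targets

class DISStyle:
    # a version copied from the original DIS repo expects two arguments:
    def compute_loss(self, preds, targets):
        return preds, targets

loss_args = ("preds", "targets")
RepoStyle().compute_loss(loss_args)  # works
DISStyle().compute_loss(loss_args)   # TypeError: compute_loss() missing 1
                                     # required positional argument: 'targets'
```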
If I want to write the generated dataset images to files on my disk, where would be the best place to insert the code for that? I would like to see the images so I can tweak the augmentations more easily.
You can insert the code at https://github.com/SkyTNT/anime-segmentation/blob/2373527d745755fbe2987ba146d9326fed8e8881/dataset_generator.py#L311-L312, alongside the argparse.ArgumentParser setup there. See the sketch below.
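For example, something along these lines (a sketch: it assumes each generated sample is an (image, mask) pair of float arrays in [0, 1]; adjust the unpacking and the dataset variable to match what dataset_generator.py actually builds at that spot):

```python
# Sketch: dump a handful of generated samples to disk for inspection.
import os
import cv2
import numpy as np

out_dir = "generated_preview"  # hypothetical output folder
os.makedirs(out_dir, exist_ok=True)

for i, (image, mask) in enumerate(dataset):  # `dataset` = the generator built here
    if i >= 20:  # a few samples are enough for tweaking augmentations
        break
    # cv2 writes BGR; use cv2.cvtColor if the channels come out swapped
    cv2.imwrite(os.path.join(out_dir, f"{i:04d}_img.png"),
                (image * 255).astype(np.uint8))
    cv2.imwrite(os.path.join(out_dir, f"{i:04d}_mask.png"),
                (mask * 255).astype(np.uint8))
```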
A few more questions, if you don't mind.
Thanks!
Hi, could you provide more detail about how you put together the dataset and trained the model? I am interested in trying to train it, but am confused about the structure of the dataset.