I'm using InstructPix2Pix to train a style-transfer model. My dataset has about 8,000 image pairs, and the instruction is just "change the style to ……". I've found that the more steps the model trains, the worse the inference results get, and the output becomes irrelevant to the original image. For example, when I try to change a photo to anime style, I get a twisted figure instead.
Has anyone run into a similar issue, or does anyone know why this happens?