menyifang / DCT-Net

Official implementation of "DCT-Net: Domain-Calibrated Translation for Portrait Stylization", SIGGRAPH 2022 (TOG); Multi-style cartoonization
Apache License 2.0

Training Clipart-Style tooth or teeth issues #42

Open aLohrer opened 1 year ago

aLohrer commented 1 year ago

Hi, congrats on the great paper!

I want to try to port this nice work to mobile, but before getting started on performance I tried to reproduce the results.

As suggested, I went with one of the SD styles, since it seemed easy to generate data. I tried the clipart style.
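For context, my generation step looked roughly like the following sketch, assuming the diffusers library; the checkpoint name and prompt are illustrative, not the exact ones the repo suggests:

```python
from pathlib import Path

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

out_dir = Path("style_data")
out_dir.mkdir(exist_ok=True)

# Generate a batch of clipart-style portraits as the target-style dataset.
for i in range(100):
    image = pipe(
        "clipart style portrait of a person, flat colors, bold outlines",
        num_inference_steps=30,
    ).images[0]
    image.save(out_dir / f"clipart_{i:04d}.png")
```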

Here is an example of a generated clipart image: [image]

They all look pretty good. Afterwards I went on to generate samples via StyleGAN2: [image]

I realized that the cartoon samples generated by StyleGAN2 are only 256x256. Is that an issue?
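If it is, upsampling the samples before training would look roughly like this minimal sketch, assuming Pillow; the 512 target is my assumption and should match whatever resolution the training config actually expects:

```python
from pathlib import Path

from PIL import Image

src = Path("stylegan2_samples")
dst = Path("stylegan2_samples_512")
dst.mkdir(exist_ok=True)

# Upsample every 256x256 StyleGAN2 sample to the assumed training resolution.
for p in src.glob("*.png"):
    Image.open(p).resize((512, 512), Image.LANCZOS).save(dst / p.name)
```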

Anyway, the next step is training the texture translator, starting from the anime model as initial weights.
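Warm-starting looked roughly like this (a minimal sketch, assuming the repo's TF1-style training graph has already been built; the checkpoint path is illustrative only):

```python
import tensorflow as tf

saver = tf.compat.v1.train.Saver()  # collects the already-built graph's variables
with tf.compat.v1.Session() as sess:
    saver.restore(sess, "pretrained_models/anime/model.ckpt")
    # ...run the texture-translation training loop from here
```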

Iteration 0 (basically anime style): [image]

Iteration 1000: [image]

Iteration 10000: [image]

Iteration 30000: [image]

Iteration 100000: [image] [image]

Here are the loss curves: [image]

From the images I have seen so far, it really is catching the style nicely, but it has a major problem with teeth. Unfortunately, that is quite an important facial part.

My question is:

Can you point me to a dataset I should use and to the intermediate results that are expected? I am happy to test a different style to validate the training process.

Bonus question (this is just thinking out loud):

As my final goal is something really performant, I would like to swap out the U-Net for a MobileNetV3. I am currently not sure whether a MobileNet can pick up the unsupervised training signal, or whether it would be better to train the U-Net first and then use a teacher/student approach to transfer the results to a MobileNet in a supervised fashion; a sketch of that idea is below. Did you test different architectures for the texture translation block?
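A minimal sketch of the distillation variant, assuming PyTorch; `UNetTranslator`, `MobileNetV3Translator`, and `photo_loader` are hypothetical placeholders, not names from this repo:

```python
import torch
import torch.nn.functional as F

teacher = UNetTranslator().eval()   # trained texture translator, frozen
student = MobileNetV3Translator()   # lightweight replacement to distill into
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

for photos in photo_loader:         # unpaired real photos are enough here
    with torch.no_grad():
        targets = teacher(photos)   # teacher outputs act as pseudo-labels
    loss = F.l1_loss(student(photos), targets)  # optionally add a perceptual term
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```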

Sorry for the many questions, but this is such interesting work that I could ask 100 more (but I won't, promised :crossed_fingers:).

aLohrer commented 1 year ago

OK, I went through the training code once more and realized that the facial perception loss is not implemented.

Might this cause the above-mentioned issues? Will you release the source code for the facial perception loss?
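For reference, here is a minimal sketch of one plausible formulation, assuming facenet-pytorch's pretrained InceptionResnetV1 as the face-feature extractor; this is only a guess at what the paper's loss might look like, not the authors' implementation:

```python
import torch
import torch.nn.functional as F
from facenet_pytorch import InceptionResnetV1

# Frozen face-recognition network used purely as a feature extractor.
face_net = InceptionResnetV1(pretrained="vggface2").eval()
for p in face_net.parameters():
    p.requires_grad_(False)

def facial_perception_loss(source, stylized):
    """Identity distance between the input photo and the stylized output.

    Both tensors are assumed to be aligned 160x160 face crops in [-1, 1].
    """
    return 1 - F.cosine_similarity(face_net(source), face_net(stylized)).mean()
```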

huimlight commented 1 year ago

@aLohrer May I ask whether you are using single-card or multi-card training? I currently use single-card training, and the results are very poor.

aLohrer commented 1 year ago

Single card.

Can you share your results for comparison?

I think in my case it is mostly a problem caused by the missing facial perception loss.

h3clikejava commented 7 months ago

I used a single Nvidia 4090, training for 71 h and 300k steps. The results are very poor.

[images: 299999_face_result, 296999_face_result, 295999_face_result]

[image]

9527-csroad commented 4 months ago

Hi, @aLohrer. I have the same question, and I also found that there is no facial perception loss in the code. Did you solve this problem? If so, could you share your findings? Thanks, hope you have a good day.