sjtuplayer / few-shot-diffusion

[ICCV 2023] Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption

No such file or directory: 'features1.npy' #1

Open · zhouzheng1123 opened this issue 12 months ago

zhouzheng1123 commented 12 months ago

When I run train-whole.py, an error occurs: 'No such file or directory: features1.npy'. May I know how to solve it?

sjtuplayer commented 12 months ago

features1.npy contains the features extracted by the CLIP model, which are used to calculate the Directional Distribution Consistency Loss. You can employ the CLIP model to encode more than 1000 source images as features1.npy and encode the few-shot target images as features2.npy. Note that both features1.npy and features2.npy have shape num*dim (num is the number of encoded images and dim is the dimension of the CLIP features).
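
For reference, a minimal sketch of how such a file can be produced with the openai `clip` package; the ViT-B/32 backbone and the image folder path are assumptions, and the repository's own feature-extractor.py may use a different setup:

```python
import glob

import clip
import numpy as np
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # assumed backbone

features = []
for path in sorted(glob.glob("source_images/*.jpg")):  # assumed folder of source images
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        feat = model.encode_image(image)               # shape (1, dim)
    features.append(feat.squeeze(0).cpu().numpy())

# Stack into a num*dim array and save; repeat with the few-shot target
# images to produce features2.npy.
np.save("features1.npy", np.stack(features, axis=0))
```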

zhouzheng1123 commented 12 months ago

> features1.npy contains the features extracted by the CLIP model, which are used to calculate the Directional Distribution Consistency Loss. You can employ the CLIP model to encode more than 1000 source images as features1.npy and encode the few-shot target images as features2.npy. Note that both features1.npy and features2.npy have shape num*dim (num is the number of encoded images and dim is the dimension of the CLIP features).

Thank you for your answer, but I still don't know how to generate the npy files. Do you have a script for this in your code? Could you provide more detailed instructions for generating them?

sjtuplayer commented 12 months ago

The code and README have been updated; you can run feature-extractor.py to encode the images.

zhouzheng1123 commented 12 months ago

> The code and README have been updated; you can run feature-extractor.py to encode the images.

Thank you very much. This is great work, and I have another question: after the source and target images are trained together, will style-transferred versions of the source images be generated automatically? I don't see any testing code. I am currently running the third training stage; it is a bit slow and hasn't generated any images yet.

zhouzheng1123 commented 11 months ago

Thank you very much for patiently answering my questions. However, when I trained with train-whole.py, I set the number of epochs to 1. After training, no style-transferred images were generated, only weight files. What should I do to generate style-transferred images?

sjtuplayer commented 11 months ago

The inference code will be released together with the checkpoints soon; both are currently under preparation. If you want to generate images now, you can temporarily use DDIM to sample them.
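
Until the official inference script is released, a generic deterministic DDIM sampling loop looks roughly like the sketch below; the noise-prediction interface `model(x, t)` and the `alphas_cumprod` buffer are assumptions about the trained DDPM, so the released code may expose a different API:

```python
import torch

@torch.no_grad()
def ddim_sample(model, alphas_cumprod, shape, num_steps=50, device="cuda"):
    """Deterministic DDIM sampling (eta = 0) from a noise-prediction model.

    alphas_cumprod: 1-D tensor of cumulative alphas (length T) on `device`.
    """
    T = alphas_cumprod.shape[0]
    # Evenly spaced timesteps from T-1 down to 0.
    timesteps = torch.linspace(T - 1, 0, num_steps, device=device).long()

    x = torch.randn(shape, device=device)                    # start from pure noise
    for i, t in enumerate(timesteps):
        a_t = alphas_cumprod[t]
        a_prev = (alphas_cumprod[timesteps[i + 1]] if i + 1 < num_steps
                  else torch.tensor(1.0, device=device))

        eps = model(x, t.repeat(shape[0]))                    # predicted noise eps(x_t, t)
        x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()        # predicted clean image x_0
        x = a_prev.sqrt() * x0 + (1 - a_prev).sqrt() * eps    # deterministic DDIM step
    return x
```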

tonia86 commented 10 months ago

I am very interested in your work. When will the inference code be updated?

sjtuplayer commented 10 months ago

Hi~ Thanks for your attention. The inference code will be updated immediately after CVPR 2024.

Jamie-Cheung commented 9 months ago

> The inference code will be released together with the checkpoints soon; both are currently under preparation. If you want to generate images now, you can temporarily use DDIM to sample them.

May I ask if you can provide the inference code for DDIM sampling? I am preparing an SCI paper submission and need to run a comparison experiment against your method. I hope you can provide it; we will cite your paper.

sjtuplayer commented 9 months ago

Thanks for your patience. I've just finished the supplementary material for CVPR 2024, and I'll release the DDIM inference code and the relevant checkpoints before Friday. May I ask which domains you need most? I will organize and release those checkpoints first.

Jamie-Cheung commented 9 months ago

> Thanks for your patience. I've just finished the supplementary material for CVPR 2024, and I'll release the DDIM inference code and the relevant checkpoints before Friday. May I ask which domains you need most? I will organize and release those checkpoints first.

I'd like you to provide the checkpoint trained on Van Gogh first. Thank you.

sjtuplayer commented 9 months ago

The code has been updated and some pre-trained models are provided. If you have any problems running the code, please feel free to contact us. (Note: the pre-trained models were newly trained with the updated code, so the results may differ slightly from those in the paper.)

Jamie-Cheung commented 9 months ago

> The code has been updated and some pre-trained models are provided. If you have any problems running the code, please feel free to contact us. (Note: the pre-trained models were newly trained with the updated code, so the results may differ slightly from those in the paper.)

Thank you for providing the pre-trained models. However, when I run "python3 train.py --data_path=path_to_dataset" on my source dataset, it seems that train.py needs to load "/home/huteng/DDPM2/checkpoints/481157.pth". Thank you for your further response.

sjtuplayer commented 9 months ago

> Thank you for providing the pre-trained models. However, when I run "python3 train.py --data_path=path_to_dataset" on my source dataset, it seems that train.py needs to load "/home/huteng/DDPM2/checkpoints/481157.pth". Thank you for your further response.

You can just delete that line and train from scratch.

fikry102 commented 8 months ago

> features1.npy contains the features extracted by the CLIP model, which are used to calculate the Directional Distribution Consistency Loss. You can employ the CLIP model to encode more than 1000 source images as features1.npy and encode the few-shot target images as features2.npy. Note that both features1.npy and features2.npy have shape num*dim (num is the number of encoded images and dim is the dimension of the CLIP features).

Is it necessary to encode all the source images from the source dataset? Or just choose 1000 source images randomly?

Jamie-Cheung commented 8 months ago

> features1.npy contains the features extracted by the CLIP model, which are used to calculate the Directional Distribution Consistency Loss. You can employ the CLIP model to encode more than 1000 source images as features1.npy and encode the few-shot target images as features2.npy. Note that both features1.npy and features2.npy have shape num*dim (num is the number of encoded images and dim is the dimension of the CLIP features).
>
> Is it necessary to encode all the source images from the source dataset? Or just choose 1000 source images randomly?

I only randomly chose a subset of the images.

sjtuplayer commented 8 months ago

> features1.npy contains the features extracted by the CLIP model, which are used to calculate the Directional Distribution Consistency Loss. You can employ the CLIP model to encode more than 1000 source images as features1.npy and encode the few-shot target images as features2.npy. Note that both features1.npy and features2.npy have shape num*dim (num is the number of encoded images and dim is the dimension of the CLIP features).
>
> Is it necessary to encode all the source images from the source dataset? Or just choose 1000 source images randomly?

Actually, the more images you encode, the more accurate the estimate of the source-domain center is. We encoded about 10K-20K source-domain images in our experiments.
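
For reference, a minimal sketch of loading the two feature files and estimating the domain centers; treating the normalized center difference as the cross-domain direction is an assumption here, and the exact Directional Distribution Consistency Loss in the paper may combine the centers differently:

```python
import numpy as np

# features1.npy: CLIP features of source-domain images, shape (num_src, dim)
# features2.npy: CLIP features of the few-shot target images, shape (num_tgt, dim)
src_feats = np.load("features1.npy")
tgt_feats = np.load("features2.npy")

# Domain centers: the mean CLIP feature of each domain. Encoding more source
# images gives a more stable estimate of the source-domain center.
src_center = src_feats.mean(axis=0)
tgt_center = tgt_feats.mean(axis=0)

# Assumed usage: a normalized source-to-target direction in CLIP space.
direction = tgt_center - src_center
direction /= np.linalg.norm(direction) + 1e-8
print("feature dim:", src_feats.shape[1], "num source images:", src_feats.shape[0])
```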

15634960802 commented 4 months ago

The results I get from the inference-stage code are just source-dataset images with added noise; no target-domain images are generated. What could be the reason for this?

boxbox2 commented 2 months ago

> The results I get from the inference-stage code are just source-dataset images with added noise; no target-domain images are generated. What could be the reason for this?

You can try #11.