MrWan001 opened this issue 1 year ago
Hello MrWan001,

`DEFAULT_EMBED_PATH` in the code points to the location of the fine-tuned tokens from textual inversion. We will shortly be releasing the tokens for the three datasets we evaluated on, which you can download and place in the location specified by `DEFAULT_EMBED_PATH`. Which datasets are you using, or are you using a custom dataset?
Best, Brandon
Hello @brandontrabucco,

I'm dealing with a similar issue. Could you please explain the difference between `DEFAULT_EMBED_PATH` and `DEFAULT_SYNTHETIC_DIR`? That is, if the former is intended to point to the textual inversion tokens as you've said, what should the latter point to? I ask because intuitively those two variables seem to mean the same thing.

Also, I followed your instructions here and now have a `learned_embeds.bin` for each COCO class (the .bin files are in `coco-#-#/{class}/` for every class). How should I format the relevant parameters if I want to run `train_classifier.py`? The given template for `DEFAULT_EMBED_PATH` seems class-agnostic...
Best, jl
Hello jlsaint,

Thanks for following up on this issue! The parameter `DEFAULT_SYNTHETIC_DIR` points to a location on the local disk of your machine where augmented images from Stable Diffusion will be saved for caching. This serves two roles:

First, caching the images to disk means they don't have to be stored in memory, which can be crucial for datasets with many images or classes, where there may be too many augmented images to hold at once. Note that in our example training scripts we generate the augmented images only once, at the beginning of training, using the `train_dataset.generate_augmentations(num_synthetic)` method. Here `train_dataset` is an instance of our `FewShotDataset`, and `num_synthetic` is an integer that controls how many synthetic images we generate from Stable Diffusion for each real image. If you want to generate synthetic images more than once, you need only call `generate_augmentations` again later in the script.
Second, having the images cached means you can inspect the augmented images while tuning hyperparameters and confirm that DA-Fusion is working as expected.
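The caching pattern described above can be sketched as follows. This is a hypothetical illustration, not the actual `FewShotDataset` implementation: the `augment` callable stands in for the Stable Diffusion pipeline, and the file naming is made up for the example.

```python
import os
import tempfile

# Hypothetical sketch of the caching behavior: each real image gets
# `num_synthetic` augmented copies written to `synthetic_dir`, so the
# synthetic images live on disk rather than in memory. `augment` is a
# stand-in for the Stable Diffusion call in the real FewShotDataset.
def generate_augmentations(real_images, num_synthetic, synthetic_dir, augment):
    os.makedirs(synthetic_dir, exist_ok=True)
    paths = []
    for idx, image in enumerate(real_images):
        for aug_idx in range(num_synthetic):
            path = os.path.join(synthetic_dir, f"aug-{idx}-{aug_idx}.png")
            with open(path, "wb") as f:
                f.write(augment(image))  # write the synthetic image bytes
            paths.append(path)
    return paths

synthetic_dir = tempfile.mkdtemp()
paths = generate_augmentations(
    real_images=[b"img0", b"img1"],   # placeholder image data
    num_synthetic=3,
    synthetic_dir=synthetic_dir,
    augment=lambda image: image,      # identity stand-in for diffusion
)
print(len(paths))  # 2 real images x 3 synthetic each = 6 cached files
```

Calling the function again with a different `synthetic_dir` (or different file names) corresponds to regenerating the augmentations later in the script.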
For your last point, take a look at this script: https://github.com/brandontrabucco/da-fusion/blob/main/aggregate_embeddings.py

After we do textual inversion and have several class-specific tokens, we merge them into a single dictionary containing all the tokens using the above script. This produces a class-agnostic template for `DEFAULT_EMBED_PATH`.
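The merge step can be sketched roughly as below. This is a simplified illustration, not the actual `aggregate_embeddings.py`: it assumes each class directory holds a `learned_embeds.bin` mapping that class's placeholder token to its embedding, and it uses `pickle` so the sketch runs without torch (the real files are loaded and saved with `torch.load`/`torch.save`).

```python
import os
import pickle
import tempfile

# Simplified sketch of aggregating per-class textual inversion tokens:
# load each class's learned_embeds.bin and merge the token -> embedding
# entries into one dictionary, saved at a single path.
def aggregate_embeddings(class_dirs, out_path):
    merged = {}
    for class_dir in class_dirs:
        embed_path = os.path.join(class_dir, "learned_embeds.bin")
        with open(embed_path, "rb") as f:
            merged.update(pickle.load(f))  # one token per class
    with open(out_path, "wb") as f:
        pickle.dump(merged, f)
    return merged

# Build two fake class directories to demonstrate the merge.
root = tempfile.mkdtemp()
for name, vec in [("cat", [0.1, 0.2]), ("dog", [0.3, 0.4])]:
    class_dir = os.path.join(root, name)
    os.makedirs(class_dir)
    with open(os.path.join(class_dir, "learned_embeds.bin"), "wb") as f:
        pickle.dump({f"<{name}>": vec}, f)

merged = aggregate_embeddings(
    [os.path.join(root, "cat"), os.path.join(root, "dog")],
    os.path.join(root, "merged.pt"),
)
print(sorted(merged))  # ['<cat>', '<dog>']
```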
Let me know if you have other questions!
Best, Brandon
Hello @brandontrabucco, is it possible to share the fine-tuned tokens from textual inversion for the three datasets?
I am hoping to run it on ImageNet. Thanks.
Sure! We have uploaded the current set of tokens here: https://drive.google.com/drive/folders/1JxPq05zy1_MGbmgHfVIeeFMjL56Cef53?usp=sharing
Thank you very much.
DEFAULT_EMBED_PATH = "/root/downloads/da-fusion/{dataset}-tokens/{dataset}-{seed}-{examples_per_class}.pt"

Hello, the .pt file cannot be found. What effect does this have on the program?
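For reference, a brace-style template like the one above is filled in with Python's `str.format`. A minimal sketch, assuming hypothetical placeholder values (the path prefix and values below are examples only, not the repository's defaults):

```python
# Hypothetical illustration of how a brace-style path template is
# resolved with str.format; the values are examples, not defaults.
DEFAULT_EMBED_PATH = "{dataset}-tokens/{dataset}-{seed}-{examples_per_class}.pt"

path = DEFAULT_EMBED_PATH.format(dataset="coco", seed=0, examples_per_class=4)
print(path)  # coco-tokens/coco-0-4.pt
```

If no file exists at the resolved path, the script cannot load the textual inversion tokens, so the downloaded tokens need to be placed where the template resolves to.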