CompVis / taming-transformers

Taming Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2012.09841
MIT License
5.77k stars 1.14k forks

config files for conditional training on segmentation maps #17

Open ash80 opened 3 years ago

ash80 commented 3 years ago

Great paper! I am trying to retrain this model on an image dataset for which I can generate segmentation masks using DeepLab v2. However, there is no config yaml file for training the transformer, as there is for FacesHQ or DRIN. Could you please provide a sample yaml file for training with segmentation masks? Many thanks.

ink1 commented 3 years ago

I'm also interested in that, see #16. The best starting point I could find is the yaml file shared with the sflckr checkpoint. Replace validation with train at the end of the file. But my progress basically stops there.
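As a rough illustration of that suggestion, here is a sketch only: it reuses the taming.data.sflckr.Examples class from the distributed sflckr.yaml, which just serves the bundled example images, so for real training you would point the targets at your own segmentation dataset class.

```yaml
# Sketch of the suggested edit to the data section of sflckr.yaml:
# duplicate the validation split as a train split. The Examples class
# only loads the few bundled demo images, so substitute your own
# dataset class for actual training.
data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 1
    train:
      target: taming.data.sflckr.Examples  # replace with your dataset class
      params:
        size: 256
    validation:
      target: taming.data.sflckr.Examples
      params:
        size: 256
```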

attashe commented 3 years ago

I am trying to train this on the Flickr30k dataset, and after 21 epochs the intermediate results have not changed at all. I used the config from the sflckr checkpoint. (Screenshots attached.)

akmtn commented 3 years ago

The distributed sflckr.yaml seems insufficient for training, because some settings are missing, for example model.params.lossconfig.

Hi authors, could you please provide a sample yaml file for training with segmentation masks? Thanks.
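For anyone hitting the same gap: a heavily hedged sketch of what the missing lossconfig stanza might look like. The module path follows the repository's taming/modules/losses layout, but the exact target and parameter names here are assumptions, not the authors' released config.

```yaml
# Assumed sketch of a lossconfig block for a segmentation cond stage;
# the target path and params are guesses based on the repo layout,
# not the authors' released configuration.
model:
  params:
    lossconfig:
      target: taming.modules.losses.segmentation.BCELossWithQuant
      params:
        codebook_weight: 1.0
```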

pesser commented 3 years ago

Added a config and loss to train the cond stage on segmentation maps (configs/coco_cond_stage.yaml and configs/sflckr_cond_stage.yaml). Optionally, you can also extract the weights of the cond stage from the transformer checkpoints,

python scripts/extract_submodel.py logs/2021-01-20T16-04-20_coco_transformer/checkpoints/last.ckpt coco_cond_stage.ckpt cond_stage_model

and fine-tune from there (maybe adjust data section of config):

python main.py --base configs/coco_cond_stage.yaml -t True --gpus 0, model.params.ckpt_path=coco_cond_stage.ckpt

kampta commented 3 years ago

I have a related question. It looks like the current config file sflckr_cond_stage.yaml resizes images with SmallestMaxSize=256, so the model was essentially trained on smaller (resized) images. Was the provided model checkpoint trained with this same config? I'd imagine that to sample high-res images, we should just crop the images without resizing.

ali-design commented 3 years ago

Thank you for the great effort! How can I train the conditional transformer when I would like to condition the image on a vector (as opposed to a depth map or class label)? Specifically, what would the config look like?
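There is no official config for vector conditioning, but as a purely hypothetical sketch, the transformer yaml's cond_stage_config could point at a custom cond-stage module you implement yourself. VectorCondStage below does not exist in the repository; it would need an encode method mirroring the existing cond-stage interface.

```yaml
# Entirely hypothetical: VectorCondStage is NOT part of the repository.
# You would implement it yourself, with an encode() that maps your
# conditioning vector into the representation the transformer expects.
model:
  params:
    cond_stage_config:
      target: taming.modules.misc.vector_cond.VectorCondStage  # hypothetical module
      params:
        in_dim: 512     # dimensionality of your conditioning vector
        embed_dim: 256  # assumed embedding size consumed by the transformer
```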

Kai-0515 commented 2 years ago

> Added config and loss to train cond stage on segmentation maps (configs/coco_cond_stage.yaml and configs/sflckr_cond_stage.yaml). Optionally, you can also extract weights of cond stage from the transformer checkpoints,
>
> python scripts/extract_submodel.py logs/2021-01-20T16-04-20_coco_transformer/checkpoints/last.ckpt coco_cond_stage.ckpt cond_stage_model
>
> and fine-tune from there (maybe adjust data section of config):
>
> python main.py --base configs/coco_cond_stage.yaml -t True --gpus 0, model.params.ckpt_path=coco_cond_stage.ckpt

Hello, how can I sample from a segmentation model? Sampling from the segmentation model as described in the README fails, because VQSegmentationModel has no attribute 'encode_to_z', but make_samples.py uses it. Looking forward to your reply.

Kai-0515 commented 2 years ago

> I am trying to train this on the Flickr30k dataset, and after 21 epochs the intermediate results have not changed at all. I used the config from the sflckr checkpoint. (Screenshots attached.)

Hello, could you please tell me how you sample from the segmentation model? Sampling as described in the README failed for me, because VQSegmentationModel has no attribute 'encode_to_z'.