mzalaki00 opened 2 weeks ago
TeVNet is used to decompose infrared images into T, e, and V components; it is not used for generating infrared images. You can download the PID checkpoint and modify the command like "python scripts/rgb2ir_vqf8.py xxxxxxx --checkpoint /path/to/PID_checkpoint".
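Assembled from the flags that appear later in this thread, the suggested invocation would look something like the sketch below (which flags the script actually accepts, and the checkpoint path, are assumptions to be adapted):

```shell
# Sketch only: flag names are copied from a later post in this thread,
# and /path/to/PID_checkpoint is a placeholder for the downloaded PID weights.
python scripts/rgb2ir_vqf8.py \
  --steps 200 \
  --indir ./image \
  --outdir ./result \
  --config ./configs/latent-diffusion/flir512-vqf8.yaml \
  --ddim_eta 0.0 \
  --checkpoint /path/to/PID_checkpoint
```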
Its output appears improved but still deviates from the ground truth. I use the YAML config flir512-vqf8.yaml with this content:

```yaml
model:
  base_learning_rate: 1.0e-06
  target: ldm.models.diffusion.ddpm_tev.LatentDiffusion
  params:
    load_only_unet: True
    tevloss_weight_rec: 50
    tevloss_weight_tev: 50
    pixel_tev: true
    vnums: 4
    linear_start: 0.0015
    linear_end: 0.0205
    log_every_t: 100
    timesteps: 1000
    loss_type: l1
    first_stage_key: image
    cond_stage_key: conditional
    image_size: 64
    channels: 4
    concat_mode: true
    monitor: val/loss_simple_ema
    cond_stage_trainable: true
    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 64
        in_channels: 7
        out_channels: 4
        model_channels: 128
        attention_resolutions:
        - 8
        - 4
        - 2
        num_res_blocks: 2
        channel_mult:
        - 1
        - 4
        - 8
        num_head_channels: 8
    first_stage_config:
      target: ldm.models.autoencoder.VQModelInterface
      params:
        ckpt_path: "./pretrained/vqf8_pretrained/model.ckpt"
        embed_dim: 4
        n_embed: 16384
        monitor: val/rec_loss
        ddconfig:
          double_z: false
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 2
          - 4
          num_res_blocks: 2
          attn_resolutions:
          - 32
          dropout: 0
        lossconfig:
          target: torch.nn.Identity
    cond_stage_config:
      target: ldm.modules.encoders.modules.SpatialRescaler
      params:
        n_stages: 3
        method: bicubic
        in_channels: 3
        out_channels: 3
    tev_net_config:
      target: ldm.modules.HADARNet.modules.HADARNet
      params:
        in_channels: 3
        out_channels: 6
        smp_model: Unet
        smp_encoder: resnet18
        ckpt_path: "./pretrained/TeVNet_FLIR/epoch_1000.pth"

data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 12
    num_workers: 4
    wrap: false
    train:
      target: ldm.data.FLIRv1512.FLIRTrain
      params:
        size: 512
    validation:
      target: ldm.data.FLIRv1512.FLIRVal
      params:
        size: 512
```
Are these settings correct, particularly the two checkpoint paths specified above?
The FLIR dataset we used is "FLIR thermal dataset version 1.3". We recommend loading the KAIST checkpoint to test the RGB images we provided.
Thanks, but I'm not asking about the FLIR dataset version; I'm asking about the models and the YAML config. With PID_KAIST I used `--config ./configs/latent-diffusion/kaist512-vqf8.yaml --ddim_eta 0.0 --checkpoint ./pretrained/PID_KAIST/epoch=000235-step=000059999.ckpt`.
In both YAML files the checkpoint params point to TeVNet, not PID. I have tried every checkpoint, but none of them makes the output better!
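The three checkpoint families mentioned in this thread are easy to confuse. Below is a minimal, filename-based sanity helper; the directory names are assumed from the paths quoted in this thread and are not a guaranteed convention, and robust code would inspect the state dict instead:

```python
from pathlib import Path

def guess_checkpoint_role(path: str) -> str:
    """Guess which component a checkpoint belongs to, from its path alone.

    Heuristic only: the directory names (PID_*, TeVNet_*, vqf8_pretrained)
    mirror the layout quoted in this thread, not a guaranteed convention.
    """
    name = Path(path).as_posix()
    if "PID" in name:
        return "PID diffusion model (pass via --checkpoint)"
    if "TeVNet" in name:
        return "TeVNet decomposition net (tev_net_config.ckpt_path / TeVNet/test.py)"
    if "vqf8" in name:
        return "VQ-f8 autoencoder (first_stage_config.ckpt_path)"
    return "unknown"

print(guess_checkpoint_role("./pretrained/PID_KAIST/epoch=000235-step=000059999.ckpt"))
print(guess_checkpoint_role("./pretrained/TeVNet_KAIST/epoch_950.pth"))
```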
> The FLIR dataset we used is "FLIR thermal dataset version 1.3". We recommend loading the KAIST checkpoint to test the RGB images we provided.
Because your YAML file is for FLIR:

> Its output appears improved but still deviates from the ground truth. I use the YAML config flir512-vqf8.yaml with this content:

```yaml
model:
  base_learning_rate: 1.0e-06
  target: ldm.models.diffusion.ddpm_tev.LatentDiffusion
  params:
    load_only_unet: True
    tevloss_weight_rec: 50
    tevloss_weight_tev: 50
    pixel_tev: true
    vnums: 4
    linear_start: 0.0015
    linear_end: 0.0205
    log_every_t: 100
    timesteps: 1000
    loss_type: l1
    first_stage_key: image
    cond_stage_key: conditional
    image_size: 64
    channels: 4
    concat_mode: true
    monitor: val/loss_simple_ema
    cond_stage_trainable: true
    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 64
        in_channels: 7
        out_channels: 4
        model_channels: 128
        attention_resolutions:
        - 8
        - 4
        - 2
        num_res_blocks: 2
        channel_mult:
        - 1
        - 4
        - 8
        num_head_channels: 8
    first_stage_config:
      target: ldm.models.autoencoder.VQModelInterface
      params:
        ckpt_path: "./pretrained/vqf8_pretrained/model.ckpt"
        embed_dim: 4
        n_embed: 16384
        monitor: val/rec_loss
        ddconfig:
          double_z: false
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 2
          - 4
          num_res_blocks: 2
          attn_resolutions:
          - 32
          dropout: 0
        lossconfig:
          target: torch.nn.Identity
    cond_stage_config:
      target: ldm.modules.encoders.modules.SpatialRescaler
      params:
        n_stages: 3
        method: bicubic
        in_channels: 3
        out_channels: 3
    tev_net_config:
      target: ldm.modules.HADARNet.modules.HADARNet
      params:
        in_channels: 3
        out_channels: 6
        smp_model: Unet
        smp_encoder: resnet18
        ckpt_path: "./pretrained/TeVNet_FLIR/epoch_1000.pth"

data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 12
    num_workers: 4
    wrap: false
    train:
      target: ldm.data.FLIRv1512.FLIRTrain
      params:
        size: 512
    validation:
      target: ldm.data.FLIRv1512.FLIRVal
      params:
        size: 512
```
Are these settings correct, particularly the two checkpoint paths specified above?
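Note that the config above hard-wires two checkpoints itself (the VQ-f8 autoencoder and TeVNet), while the PID weights are supplied separately via `--checkpoint`. A small self-contained sketch that gathers every `ckpt_path` from a nested config dict; the dict below is a hand-trimmed excerpt of the YAML above, not the full file:

```python
def collect_ckpt_paths(node, found=None):
    """Recursively collect every 'ckpt_path' value from a nested config."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "ckpt_path":
                found.append(value)
            else:
                collect_ckpt_paths(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_ckpt_paths(item, found)
    return found

# Hand-trimmed excerpt of the flir512-vqf8.yaml quoted above
config = {
    "model": {
        "params": {
            "first_stage_config": {
                "params": {"ckpt_path": "./pretrained/vqf8_pretrained/model.ckpt"}
            },
            "tev_net_config": {
                "params": {"ckpt_path": "./pretrained/TeVNet_FLIR/epoch_1000.pth"}
            },
        }
    }
}

print(collect_ckpt_paths(config))
# → ['./pretrained/vqf8_pretrained/model.ckpt', './pretrained/TeVNet_FLIR/epoch_1000.pth']
```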
You mention TeVNet many times, but as our paper clearly states, TeVNet is not involved in inference.
In our paper we have released quantitative and qualitative results on different datasets. Please refer to them.
I'm using two commands to get results from the model, but both produce images that don't resemble IR images! I put the RGB images in `--indir` / `--image-dir`.
1. `python scripts/rgb2ir_vqf8.py --steps 200 --indir ./image --outdir ./result --config ./configs/latent-diffusion/kaist512-vqf8.yaml --ddim_eta 0.0 --checkpoint ./pretrained/TeVNet_KAIST/epoch_950.pth`
2. `python ./TeVNet/test.py --image-dir ./image --output-dir ./result --smp_model Unet --smp_encoder resnet18 --vnums 4 --weights-file ./pretrained/TeVNet_KAIST/epoch_950.pth`
What might be wrong here?
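One reading of the maintainer's earlier replies is that command 1 is being given the TeVNet weights when it expects the PID diffusion checkpoint. A hedged sketch of the corrected pairing, using only paths that appear in this thread (not verified against the repo):

```shell
# 1. RGB → IR generation: the diffusion sampler takes the PID .ckpt,
#    not the TeVNet .pth.
python scripts/rgb2ir_vqf8.py --steps 200 --indir ./image --outdir ./result \
  --config ./configs/latent-diffusion/kaist512-vqf8.yaml --ddim_eta 0.0 \
  --checkpoint ./pretrained/PID_KAIST/epoch=000235-step=000059999.ckpt

# 2. T/e/V decomposition only: this is where the TeVNet weights belong
#    (also referenced by tev_net_config.ckpt_path in the YAML).
python ./TeVNet/test.py --image-dir ./image --output-dir ./result \
  --smp_model Unet --smp_encoder resnet18 --vnums 4 \
  --weights-file ./pretrained/TeVNet_KAIST/epoch_950.pth
```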