Zheng-Chong / CatVTON

CatVTON is a simple and efficient virtual try-on diffusion model with 1) a Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplified Inference (< 8 GB VRAM at 1024×768 resolution).

Online DEMO vs Local inference.py performance differences #57

Open · AI-P-K opened this issue 1 day ago

AI-P-K commented 1 day ago

I am trying to replicate the results I get from your online demo by installing the CatVTON repository locally.

  1. Inference parameters: `python inference.py --dataset 'vitonhd' --width 768 --height 1024`

  2. agnostic-mask: SCHP to generate the initial legs mask, then preprocess_agnostic_mask.py (attachment: dude2_mask)

  3. cloth-mask: SCHP to generate the mask (attachments: mask, 3)

  4. image_parse-v3: CIHP_PGN (attachment: dude2)

  5. openpose_img and openpose_json: OpenPose (attachments: dude2_rendered, dude2_keypoints.json)

  6. Original images (attachments: dude2, 3)

  7. Local output (attachment: 320dude2)

  8. Online output (attachment: converted)
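For context, the inputs in the steps above are arranged in the standard VITON-HD test layout that `inference.py`'s dataset loader reads. This is a sketch of my directory tree; folder names are taken from the steps above and from the VITON-HD convention, so the exact names in a given checkout may differ:

```
VITON-HD/test/
├── image/            # original person photos (step 6)
├── cloth/            # garment images (step 6)
├── cloth-mask/       # SCHP garment masks (step 3)
├── agnostic-mask/    # SCHP legs mask + preprocess_agnostic_mask.py (step 2)
├── image-parse-v3/   # CIHP_PGN parsing maps (step 4)
├── openpose_img/     # rendered pose images (step 5)
└── openpose_json/    # pose keypoints (step 5)
```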

May I kindly ask where the differences in output are coming from?

Zheng-Chong commented 1 day ago

Without knowing what code modifications you have made, it is impossible to identify the source of the differences from the limited information available. I suggest deploying the app directly with Gradio for testing. The purpose of inference.py is to evaluate datasets and metrics; it is not intended for application development.

AI-P-K commented 1 day ago

I did not modify the code at all; I am using exactly the repository as provided. Using Gradio is not a viable option for me.

Zheng-Chong commented 1 day ago

If the code is unmodified, inference.py will not use the 1024-resolution model, even when `--width 768 --height 1024` is specified. This could be the reason for the different results. You can force the 1024 model in the code by setting the version parameter of CatVTONPipeline to 'mix'.
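For anyone else hitting this, a minimal sketch of the suggested change as a diff against the spot in inference.py where the pipeline is constructed. The exact keyword name (`attn_ckpt_version` here) and the remaining constructor arguments are assumptions from my local copy and may differ in yours:

```diff
 pipeline = CatVTONPipeline(
-    attn_ckpt_version=args.dataset_name,  # 'vitonhd' selects dataset-specific weights
+    attn_ckpt_version="mix",              # force the 1024-resolution ("mix") weights
     ...
 )
```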

AI-P-K commented 18 hours ago

Once again you are correct... thank you for your help