Zheng-Chong / CatVTON

CatVTON is a simple and efficient virtual try-on diffusion model with 1) a Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplified Inference (< 8 GB VRAM at 1024×768 resolution).

Online DEMO vs Local inference.py performance differences #57

Closed AI-P-K closed 2 months ago

AI-P-K commented 2 months ago

I am trying to replicate the performance I get from your online demo by installing the CatVTON repository locally.

  1. Inference parameters: `python inference.py --dataset 'vitonhd' --width 768 --height 1024`

  2. agnostic-mask: SCHP to generate the initial legs mask, then preprocess_agnostic_mask.py (attached image: dude2_mask)

  3. cloth-mask: SCHP to generate the mask (attached image)

  4. image_parse-v3: CIHP_PGN (attached image: dude2)

  5. openpose_img and openpose_json: OpenPose (attachments: dude2_rendered, dude2_keypoints.json)

  6. Original images (attachments: dude2 and the cloth image)

  7. Local output (attached image: 320dude2)

  8. Online output (attached image: converted)

Can I kindly ask where the differences in output are coming from?

Zheng-Chong commented 2 months ago

Without knowing what code modifications you have made, it is impossible to identify the source of the differences from the limited information available. I suggest deploying the app directly with Gradio for testing. The purpose of inference.py is to evaluate datasets and metrics, not to serve as an application development environment.

AI-P-K commented 2 months ago

I did not modify the code at all; I use the repository exactly as provided. Using Gradio is not a viable option for me.

Zheng-Chong commented 2 months ago

If the code is unmodified, inference.py will not use the 1024-resolution model, even if you specify `--width 768 --height 1024`. This could be the reason for the different results. You can force the 1024 model in the code by setting the version parameter of CatVTONPipeline to 'mix'.
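A minimal sketch of the change described above. `CatVTONPipeline` and its `version` parameter come from the maintainer's reply; `pick_version` is a hypothetical helper, the constructor arguments shown are placeholders, and the assumption that `'vitonhd'` is the non-1024 default is mine:

```python
# Sketch of forcing the 1024-resolution model, per the maintainer's note.
# pick_version is a hypothetical helper; the CatVTONPipeline call is
# illustrative only (argument names other than `version` are placeholders).

def pick_version(width: int, height: int) -> str:
    """Return 'mix' for 1024-class resolutions (assumption: anything with a
    side >= 1024 should use the 1024 weights), else the default 'vitonhd'."""
    return "mix" if max(width, height) >= 1024 else "vitonhd"

# The issue's run (--width 768 --height 1024) would then select 'mix':
# pipeline = CatVTONPipeline(
#     base_ckpt=...,                        # placeholder
#     attn_ckpt=...,                        # placeholder
#     version=pick_version(768, 1024),      # 'mix' -> 1024-resolution model
# )
```

The key point is that the resolution flags alone do not switch the checkpoint; the `version` argument does.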

AI-P-K commented 2 months ago

Once again you are correct... thank you for your help