Closed: KANGX99 closed this issue 6 months ago
Hi, because the datasets have different sizes, we explicitly sample images from all datasets during training. Thanks to the shuffling operations, we won't miss any training image as long as we train for a sufficiently large number of epochs.
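To make this concrete, here is a minimal sketch of that sampling scheme, not the actual LIQE training code: each dataset gets its own shuffled dataloader, a fixed number of steps is drawn per epoch, and an exhausted loader is simply re-iterated, which reshuffles it. The dataset names, the second dataset's size, and the epoch count below are illustrative placeholders.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in datasets holding only image indices (the real script loads images).
datasets = {
    "kadid10k": TensorDataset(torch.arange(7125)),
    "live": TensorDataset(torch.arange(779)),
}
loaders = {name: DataLoader(ds, batch_size=16, shuffle=True)
           for name, ds in datasets.items()}
iters = {name: iter(dl) for name, dl in loaders.items()}

num_epochs = 80            # hypothetical value
num_steps_per_epoch = 200  # as in the question below

seen = {name: set() for name in datasets}
for epoch in range(num_epochs):
    for step in range(num_steps_per_epoch):
        for name, dl in loaders.items():
            try:
                (idx,) = next(iters[name])
            except StopIteration:
                # Loader exhausted mid-epoch: rebuild the iterator,
                # which reshuffles the dataset, and keep drawing.
                iters[name] = iter(dl)
                (idx,) = next(iters[name])
            seen[name].update(idx.tolist())
            # ... in the real script: load these images and take a training step ...

# Every image from every dataset ends up being visited.
print({name: f"{len(s)}/{len(datasets[name])}" for name, s in seen.items()})
```

Because the iterators carry over between epochs, a fixed 200-step epoch is just a bookkeeping unit; full passes over each dataset still happen, only spread across epoch boundaries.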
Best
zwx8981
OK, thank you for your explanation.
Hi, thanks for your great work. In the train_unique_clip_weight.py script (as shown in the image below), it appears that a single epoch doesn't utilize the entire training dataset. For example, when num_steps_per_epoch is set to 200, each batch consists of only 16 images from the KADID-10k training set, so within a single epoch only 3,200 images (200 steps × 16 images/step) are used for training, while the KADID-10k training set contains 7,125 images. Is this a training trick? I ask because shuffle is enabled in the dataloader's parameters.