machoangha opened this issue 2 weeks ago
Hi,
I have fine-tuned my model using `supervised` mode on my custom data. However, when I switch to `selfsupervised_kmeans` and add the mask file, I notice that the output shapes of the data returned by `train_data_loader_iter.next()` are inconsistent with those from the supervised mode.

Observations:

Supervised Mode Output:
- First value of item: `torch.Size([3, 32, 128])`
- Second value of item: `torch.Size([1, 25])`

Self-Supervised KMeans Mode Output:
- First value of item: `torch.Size([3, 3, 32, 128])`
- Second value of item: `torch.Size([32, 128])`
- Third value of item: `torch.Size([3, 3])`

Context: the size of each item is printed in the training script at this line: https://github.com/TongkunGuan/CCD/blob/543109a1e1d9acd15080abb3e4e72d68588ba493/train_finetune.py#L269 (see the snippet at the end of this post).

Questions:
- In the paper, self-supervised learning seems to create 2 additional augmented images to form a batch of 3, i.e. `torch.Size([3, 3, 32, 128])`, where the second item is the mask `torch.Size([32, 128])` and the third is the affine matrix `torch.Size([3, 3])`. Therefore, I believe this batch format is not compatible with the current fine-tuning script.
- Could you please provide the fine-tuning code for `selfsupervised` mode?

Thank you!
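For reference, here is roughly what I print at that line to get the shapes above. `train_data_loader_iter` and its `.next()` call come from train_finetune.py, so treat this as a sketch rather than a standalone script:

```python
# Sketch of the shape check around train_finetune.py#L269;
# `train_data_loader_iter` is the iterator created earlier in that script
# (not defined here).
item = train_data_loader_iter.next()
for i, value in enumerate(item):
    print(f"value {i}: {tuple(value.shape)}")

# supervised mode prints:            (3, 32, 128) and (1, 25)
# selfsupervised_kmeans mode prints: (3, 3, 32, 128), (32, 128) and (3, 3)
```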
We used only the supervised mode in the fine-tuning file; you can modify this file to suit your needs.
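For anyone who wants to attempt that modification, here is a minimal, self-contained sketch of how the `selfsupervised_kmeans` batch layout described above (augmented views, mask, affine matrix) could be unpacked in a PyTorch training step. The tiny model, the loss, and the assumed batch dimension are placeholders, not CCD's actual pretraining objective:

```python
import torch
import torch.nn as nn

# Placeholder tensors mimicking the selfsupervised_kmeans item shapes reported
# above, with an assumed batch dimension of 4 added in front.
views  = torch.randn(4, 3, 3, 32, 128)              # (B, num_views, C, H, W)
mask   = torch.randint(0, 2, (4, 32, 128)).float()  # (B, H, W) k-means mask
affine = torch.eye(3).repeat(4, 1, 1)               # (B, 3, 3) affine matrix per sample

# Toy encoder standing in for the CCD backbone.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.AdaptiveAvgPool2d(1), nn.Flatten())
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Fold the view axis into the batch axis so a standard image encoder can consume it.
b, v, c, h, w = views.shape
features = model(views.reshape(b * v, c, h, w)).reshape(b, v, -1)

# Placeholder consistency loss between the augmented views; CCD's real objective
# presumably also uses `mask` and `affine` to align features across augmentations.
loss = (features[:, 0] - features[:, 1:].mean(dim=1)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```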
Hi,
I would like to confirm that my understanding is correct: during the pretraining phase of CCD, the model uses the self-supervised mode, but when fine-tuning the model for a specific task like text recognition, you switch to the supervised mode. Is that correct?
Thank you!