websitefingerprinting opened this issue 2 months ago:

Nice work, and I have a question here:

I am trying to pre-train and fine-tune a model on my own datasets. However, some warnings were raised when loading the pre-trained model during fine-tuning:

Is everything correct here? Thank you for your help!

It's fine; there is an extra head during pretraining, which is not used for fine-tuning.
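Concretely, those warnings usually come from loading the checkpoint non-strictly, so the pretraining-only weights are reported rather than silently used. A minimal sketch, assuming a PyTorch checkpoint (the model, path, and key names here are illustrative, not the repo's actual code):

```python
import torch
import torch.nn as nn

# Illustrative stand-in for the fine-tuning model; in practice this is
# the model class from the repo with a fresh classification head.
model = nn.Sequential(nn.Linear(16, 32), nn.Linear(32, 4))

# strict=False makes PyTorch skip keys that exist only in the checkpoint
# (the pretraining head) or only in the model (the new head), and report
# them instead of raising an error -- that report is the warning you see.
state = torch.load("pretrain_checkpoint.pt", map_location="cpu")
missing, unexpected = model.load_state_dict(state, strict=False)
print("missing keys:", missing)        # e.g. the new classification head
print("unexpected keys:", unexpected)  # e.g. the pretraining-only head
```

As long as the unexpected keys all belong to the pretraining head, the fine-tuning model is intact.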
Thank you for your prompt response. One more question, in case you have any idea about it:

Few-shot fine-tuning didn't yield good results for my classification task. I pre-trained and fine-tuned using my own datasets. With only 5% of the data, accuracy was very low, but it improved with 100% of the data. However, pretraining didn't offer any advantage over supervised training.

I append my pretraining loss below. Does the pretraining loss look correct?

Many thanks! (It is fine if you have no idea; I may have made a mistake somewhere in this process, or my problem may simply not be a good fit.)
I am not so sure; the loss looks pretty large. Have you tried doing only prompt tuning with the pretrained model? If that doesn't reach reasonable performance, it means the pretraining is not working well.
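In case it is useful, here is a minimal sketch of a related sanity check: head-only tuning on top of the frozen pretrained encoder (this is linear probing, not the repo's actual prompt-tuning code; all names and shapes are illustrative):

```python
import torch
import torch.nn as nn

# Stand-in for the pretrained backbone; assume its weights were loaded
# from the pretraining checkpoint before this point.
class Encoder(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.proj = nn.Linear(16, dim)

    def forward(self, x):
        return self.proj(x).mean(dim=1)  # pool over the sequence length

encoder = Encoder()
for p in encoder.parameters():
    p.requires_grad = False      # keep the pretrained representation fixed

head = nn.Linear(64, 10)         # only these weights are trained
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(8, 32, 16)       # dummy (batch, length, dim) batch
y = torch.randint(0, 10, (8,))
logits = head(encoder(x))
loss = nn.functional.cross_entropy(logits, y)
loss.backward()
opt.step()
```

If even this frozen-encoder probe stays near chance accuracy, the pretrained representation itself is probably the problem rather than the fine-tuning setup.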
Thank you again for your suggestion.

My data is very different from the datasets in your paper. My input is a sequence of unnormalized integers (i.e., $x_i \in \mathbb{N}^{\text{Length} \times \text{Dim}}$), which may be why the reconstruction loss is so large. Should I normalize the data before feeding it to the transformer?
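For instance, would per-channel standardization along these lines be the right thing (a sketch; the shapes and values are illustrative, not my actual data)?

```python
import numpy as np

# Dummy integer-valued sequences: (num_samples, length, dim).
train = np.random.randint(0, 1000, size=(500, 128, 8)).astype(np.float32)

# One mean/std per channel, computed on the training split only.
mean = train.mean(axis=(0, 1), keepdims=True)
std = train.std(axis=(0, 1), keepdims=True) + 1e-8

def normalize(x):
    # Reuse the training-split statistics for validation/test data too.
    return (x - mean) / std

x_norm = normalize(train)
```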