Difficulty continue self supervised pre training on custom dataset

ChristianEschen commented 10 months ago

Hello,

I'm experiencing challenges training the model on a custom dataset consisting of medical images.

Environment & Setup:

GPUs: 8 x A100 40GB
Batch Size: 32 per GPU (Total: 256)
Learning Rate: 0.001
Model: ViT-Huge, patch size 14
Epochs: 300
Warmup: Set to 0
Weights Initialization: Loaded from the provided checkpoint in this repository

Problem Description: When training with this setup, I'm finding it difficult to get the learning rate and other hyperparameters to work effectively. The loss goes up, and neither the RankMe score nor the F1-score improves. Please see the evaluation schema below and the attached figures:

Training Metrics & Evaluation: Here's a brief outline of the evaluation metrics from the training:

Evaluation Methods: loss, RankMe
Classification Type: 3-class classification downstream task on validation set
Metrics: F1 macro and accuracy (KNN)
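For readers who want to reproduce this kind of evaluation, here is a minimal NumPy sketch of the two metrics named above. It is not the code used in this issue. `rankme` follows the usual definition of RankMe as the exponential of the entropy of the normalised singular values of the embedding matrix. `knn_predict` and `macro_f1` are hypothetical helper names, and the choice of Euclidean distance, the value of `k`, and which encoder features are fed in are all assumptions.

```python
import numpy as np


def rankme(embeddings: np.ndarray, eps: float = 1e-7) -> float:
    """Effective rank of an (n_samples, dim) embedding matrix."""
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    # Normalise singular values into a distribution; eps avoids log(0).
    p = singular_values / singular_values.sum() + eps
    # exp(entropy): about 1 when embeddings collapse, up to min(n_samples, dim).
    return float(np.exp(-np.sum(p * np.log(p))))


def knn_predict(train_x, train_y, test_x, k=5):
    """Majority vote among the k nearest training embeddings (assumed Euclidean)."""
    # Pairwise squared distances, shape (n_test, n_train).
    dists = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Labels must be non-negative integers for np.bincount.
    return np.array([np.bincount(train_y[idx]).argmax() for idx in nearest])


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))
```

A RankMe value that falls during continued pre-training would suggest the embeddings are collapsing onto fewer dimensions, which is one way to read a flat or falling KNN F1 alongside a rising loss.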

I'd appreciate any guidance or recommendations to help resolve this. Thank you for your assistance.

Best regards, Christian

Spidartist commented 6 months ago

Hi @MidoAssran, I'm currently facing the same issue: the training loss behaves very strangely, as in @ChristianEschen's chart above. How can we resolve this?

Best regards, Quan

FalsoMoralista commented 3 months ago

Hi, has anybody been able to solve this yet?

facebookresearch / ijepa

Difficulty continue self supervised pre training on custom dataset #49