Open ChristianEschen opened 1 year ago
Hi @MidoAssran, I'm currently facing the same issue: the training loss behaves very strangely, as in @ChristianEschen's chart above. How can we resolve this? Best regards, Quan
Hi, has anybody been able to solve this yet?
Hello,
I'm experiencing challenges training the model on a custom dataset consisting of medical images.
Environment & Setup:
GPUs: 8 x A100 40GB
Batch Size: 32 per GPU (Total: 256)
Learning Rate: 0.001
Model: ViT-Huge, patch size 14
Epochs: 300
Warmup: Set to 0
Weights Initialization: Loaded from the provided checkpoint in this repository

Problem Description:
With this setup, I'm struggling to find a learning rate and other hyperparameters that work. The loss goes up, and the RankMe and F1 scores do not improve. Please see the evaluation schema below and the attached figures:
Training Metrics & Evaluation:
Here's a brief outline of the evaluation metrics from the training:

Evaluation Methods: loss, RankMe
Classification Type: 3-class classification downstream task on the validation set
Metrics: F1 macro and accuracy (KNN
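For reference, the downstream evaluation described above can be sketched roughly as follows. This is a minimal pure-Python sketch, not the evaluation code actually used: the helper names (`knn_predict`, `accuracy`, `macro_f1`), the choice of `k=3`, Euclidean distance, and the toy 2-D "embeddings" are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training embeddings (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy stand-ins for frozen encoder embeddings of a 3-class problem (hypothetical data).
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (10, 0), (10, 1), (11, 0)]
train_y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
val_x = [(0.5, 0.5), (5.5, 5.5), (10.5, 0.5), (4.0, 4.0)]
val_y = [0, 1, 2, 0]

val_pred = [knn_predict(train_x, train_y, q, k=3) for q in val_x]
val_acc = accuracy(val_y, val_pred)
val_f1 = macro_f1(val_y, val_pred, classes=[0, 1, 2])
print(f"accuracy={val_acc:.3f} macro_f1={val_f1:.3f}")
```

Because macro F1 averages the per-class scores without weighting, it is more sensitive than accuracy to a minority class being misclassified, which is why both are tracked here.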
I'd appreciate any guidance or recommendations to help resolve this. Thank you for your assistance.
Best regards, Christian