Hi Wensong, why don't you join our MedARC Discord community? https://discord.gg/pMhsYwJUQu
I'd be happy to offer advice/suggestions on research directions extending our MindEye work if you're open to collaboration. The MindEye3 channel would be a good place to post questions and get feedback from the larger group of people involved in this research.
Otherwise, to answer your question: I don't think there is a simple fix to the overfitting concern. But depending on the specifics of what you're going for, I could lend more targeted feedback.
Hi Paul,
Thank you for the invitation to the MedARC Discord community; I've joined.
Coming to an issue I've noticed while training MindEye: once one-third of the epochs are completed, the values of `train/fwd_percent_correct` and `train/bwd_percent_correct` abruptly jump to near 1. This behavior appears when the loss function is switched from `utils.mixco_nce` to `utils.soft_clip_loss`. I was wondering if you've encountered a similar issue during your training, and if so, what might be the potential reasons for it?
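For context, my understanding of these metrics is that they measure top-1 retrieval accuracy within a batch. Below is a minimal sketch of that computation, assuming normalized brain and CLIP image embeddings; `percent_correct` and the variable names are my own reimplementation, not the repo's exact code:

```python
import torch
import torch.nn.functional as F

# Minimal sketch of what I understand train/fwd_percent_correct and
# train/bwd_percent_correct to measure: top-1 retrieval accuracy within
# the batch. My own reimplementation, not the repo's exact utils code.
def percent_correct(preds: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    preds = F.normalize(preds, dim=-1)
    targets = F.normalize(targets, dim=-1)
    sims = preds @ targets.T                        # (B, B) cosine similarities
    labels = torch.arange(len(preds), device=preds.device)
    return (sims.argmax(dim=-1) == labels).float().mean()

# Random embeddings standing in for brain/CLIP-image embeddings:
brain = torch.randn(16, 768)
clip_img = torch.randn(16, 768)
fwd = percent_correct(brain, clip_img)  # brain -> image retrieval
bwd = percent_correct(clip_img, brain)  # image -> brain retrieval
```

If this understanding is right, the metric saturating on training batches would mean each training sample retrieves its own target almost perfectly.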
Glad to hear you joined. To reiterate, the MindEye3 channel in the Discord is the place to post questions and get feedback from the larger group of people involved in this research.
I've been working on training this model on a new dataset using two approaches: a diffusion prior with image contrastive learning (the original code) and a diffusion prior with text contrastive learning (a new attempt). It's been a challenging process, and I've encountered some issues that I'm hoping you can provide guidance on. When I evaluate the model using part of the training set as a validation set, it performs fairly well, with satisfactory reconstruction results. However, on data the model hasn't seen during training, the reconstructions are poor. It appears the model is overfitting to the training data. I would be grateful for any advice or suggestions on combating this overfitting. Are there specific strategies or techniques you'd recommend in this scenario?
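For concreteness, the kind of regularization and early-stopping setup I've been experimenting with looks roughly like the sketch below. The tiny MLP and random tensors are placeholders, not MindEye's actual model or data loading; only the pattern (weight decay, dropout, early stopping on genuinely unseen samples) is the point:

```python
import copy
import torch
import torch.nn as nn

# Hedged sketch: stand-in MLP and random data in place of the voxel-to-CLIP
# mapping and fMRI dataset; demonstrates weight decay, dropout, and early
# stopping on a held-out split of *unseen* samples.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(128, 64), nn.GELU(), nn.Dropout(0.3), nn.Linear(64, 32))
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-2)
loss_fn = nn.MSELoss()

x_train, y_train = torch.randn(256, 128), torch.randn(256, 32)
x_val, y_val = torch.randn(64, 128), torch.randn(64, 32)  # never seen in training

best_val, best_state, patience, bad = float("inf"), None, 5, 0
for epoch in range(100):
    model.train()
    opt.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    opt.step()

    model.eval()
    with torch.no_grad():
        val = loss_fn(model(x_val), y_val).item()
    if val < best_val - 1e-4:
        best_val, best_state, bad = val, copy.deepcopy(model.state_dict()), 0
    else:
        bad += 1
        if bad >= patience:  # stop once validation stops improving
            break

if best_state is not None:
    model.load_state_dict(best_state)  # roll back to the best checkpoint
```

In particular, validating on images the model has never seen (rather than a slice of the training set) is what exposed the overfitting for me, so I now use that split for early stopping as well.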