Closed xuxiran closed 6 months ago
I used the Sleep Cassette Study portion, which has the following sample size. It works fine for me.
Thank you for your timely response!
I have checked the sample size, and it is the same as you mentioned in the last comment. The sample size ([113369, 7953, 27593, 5537, 10126]) I mentioned yesterday corresponds to half of the training dataset (train_X += X[:len(X)//2 + 1], train_X_aux += X[-len(X)//2 - 1:]).
Intuitively, without balancing the labels, and with loss1 being the cross-entropy loss, the model is likely to directly predict class 0 rather than learning the features properly, especially when class 0 accounts for nearly 70% of the data.
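To put a number on the "nearly 70%" figure, here is a minimal sketch that computes each class's share from the counts quoted in this thread. The variable names are only illustrative and are not taken from the repository.

```python
# Class counts as quoted in this thread (classes 0..4).
counts = [113369, 7953, 27593, 5537, 10126]

total = sum(counts)
shares = [c / total for c in counts]

# Accuracy of a degenerate model that always predicts the most frequent class.
majority_accuracy = max(shares)

for label, share in enumerate(shares):
    print(f"class {label}: {share:.1%}")
print(f"always-predict-majority accuracy: {majority_accuracy:.1%}")
```

A model that collapses to class 0 would therefore score roughly this majority share on accuracy, which is why accuracy alone can hide the problem.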
Maybe I made some mistakes? Thank you very much again.
Hello, owner. I was also inspired by this work. I wanted to reproduce your results, but ended up getting the same results as the author of this issue. Is there a problem with the data? Apart from that, I ran the experiments exactly as provided. I'd like to know what the problem is. Thank you.
Hello @xuxiran @sonheesoo, I am so sorry for the late reply. The result you are seeing is not caused by the imbalanced dataset. The issue is in my STFT code (while cleaning up the code before release, I somehow introduced this additional wrap).
Please change the code in model.py #181
from
return torch.clip(torch.log(torch.clip(signal, min=1e-8)), min=0)
into
return torch.log(torch.clip(signal, min=1e-8))
and everything works out automatically.
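For readers wondering why the extra outer clip matters, here is a rough plain-Python sketch that mirrors the two expressions element by element. The function names and the sample magnitudes are made up for illustration, and whether the real STFT magnitudes fall mostly below 1 is an assumption, not something stated in this thread.

```python
import math

def log_spec_old(x, eps=1e-8):
    # Mirrors torch.clip(torch.log(torch.clip(x, min=eps)), min=0) for one value:
    # any negative log value is forced up to 0.
    return max(math.log(max(x, eps)), 0.0)

def log_spec_new(x, eps=1e-8):
    # Mirrors torch.log(torch.clip(x, min=eps)) for one value:
    # only the input is floored, so negative log values are kept.
    return math.log(max(x, eps))

# Hypothetical spectrogram magnitudes, several of them below 1.
magnitudes = [1e-6, 1e-3, 0.5, 1.0, 10.0]

old = [log_spec_old(m) for m in magnitudes]
new = [log_spec_new(m) for m in magnitudes]

for m, o, n in zip(magnitudes, old, new):
    print(f"magnitude={m:g}  old={o:.3f}  new={n:.3f}")
```

With the outer clip, every magnitude at or below 1 maps to 0, so those bins become indistinguishable. If most of the input falls in that range, the model would see a nearly constant spectrogram, which would be consistent with it predicting a single class.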
Thank you very much for your code. Sorry for the slow reply; I have been busy with other work these days. You are right, the code works well now. I hope my reply helps other people. I will study this work further.
Great! Let me close this issue then.
What remarkable work! However, I have some questions regarding the sleep task. Perhaps due to my negligence, I have not found a method to balance the category labels. I noticed that the five categories have the following distribution [113369, 7953, 27593, 5537, 10126]. This causes the model to output all zeros when running the current code. I am looking for a solution to this issue. What steps should I take? Thank you very much!
I downloaded the dataset from "https://www.physionet.org/content/sleep-edfx/1.0.0/" using the command "wget -r -N -c -np https://physionet.org/files/sleep-edfx/1.0.0/".
I have attempted some commonly used methods to address the issue of imbalanced categories, such as adjusting the weights of CrossEntropyLoss and using WeightedRandomSampler with replacement to perform resampling. However, these methods do not seem to solve the problem, as the model quickly converges to a specific class other than the imbalanced ones.
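For reference, one common way to derive such class weights is the inverse-frequency ("balanced") heuristic, sketched below in plain Python from the counts quoted in this thread. This is only an illustration of the general technique, not the exact weighting tried here, and the variable names are hypothetical.

```python
# Class counts as quoted in this thread (classes 0..4).
counts = [113369, 7953, 27593, 5537, 10126]

total = sum(counts)
num_classes = len(counts)

# "Balanced" heuristic: weight_c = total / (num_classes * count_c),
# so rare classes receive proportionally larger weights.
class_weights = [total / (num_classes * c) for c in counts]

for label, weight in enumerate(class_weights):
    print(f"class {label}: weight {weight:.3f}")
```

A list like this could be converted to a tensor and passed as the weight argument of torch.nn.CrossEntropyLoss, or expanded to one weight per sample for torch.utils.data.WeightedRandomSampler. Neither changes what the model sees as input, so reweighting cannot compensate if the input features themselves carry little information.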
Maybe I made some mistakes?