yen52205 opened this issue 3 years ago
Oh, I will check again. Which type of classifier? (audio, remi, magenta)
Thanks, it's the magenta type. The results I mentioned (0.57/0.744/0.744) were computed as 'number of correctly classified clips / number of test clips'. Is this the same way you compute accuracy?
I re-checked the performance, and there is no performance decrease. I think the difference comes from the global seed!
Please check and run https://github.com/SeungHeonDoh/EMOPIA_cls/blob/main/midi_cls/train_test.py with the best hparams.yaml,
or simply add a global seed in your script:
```python
from pytorch_lightning import seed_everything

if args.reproduce:
    seed_everything(42)  # fixes Python, NumPy, and torch seeds globally
```
thanks!
I didn't set a global seed. Will the global seed setting influence the inference result, or does it only influence reproducing the training?
I added a global seed to both inference_batch.py and inference.py, but still got weird results. I used inference_batch.py with the best weight (from README.md) to run inference on all the .mid clips, mapped the csv produced by inference_batch.py against 'dataset/split/test.csv', and computed how many clips were correctly classified. But I still got 0.57/0.744/0.744 on AV/A/V respectively.
Here are the dataset/split/test.csv and the csvs produced by inference_batch.py. Could you please check whether there is anything I didn't notice? 1029_seed_arousal_all.csv 1029_seed_arva_all.csv 1029_seed_valence_all.csv test.csv
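For reference, my accuracy check was roughly the sketch below. The key and label column names (songID, label) are assumptions, not the real CSV headers; they would need to be adjusted to the actual files.

```python
# Rough sketch of the accuracy check: join the inference_batch.py output
# with the ground-truth split on the clip name and count matches.
# NOTE: the "songID"/"label" column names are assumptions, not the real headers.
import pandas as pd

truth = pd.read_csv("dataset/split/test.csv")    # ground-truth test split
pred = pd.read_csv("1029_seed_arva_all.csv")     # inference_batch.py output

merged = truth.merge(pred, on="songID", suffixes=("_true", "_pred"))
accuracy = (merged["label_true"] == merged["label_pred"]).mean()
print(f"AV accuracy: {accuracy:.3f} over {len(merged)} clips")
```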
It's very weird. Could you follow the Training from scratch steps instead of using inference_batch.py?
preprocessing.py
train_test.py
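As a sketch, that sequence would look roughly like the driver below. The script locations and the --reproduce flag are assumptions; the repo's README and each script's argparse setup should be checked before running.

```python
# Hypothetical driver for the "Training from scratch" sequence described above.
# Paths and flags are assumptions; verify them against the repo.
import subprocess

# 1. Convert the EMOPIA .mid clips into model-ready features.
subprocess.run(["python", "preprocessing.py"], check=True)
# 2. Train and evaluate in one run; --reproduce would trigger seed_everything(42).
subprocess.run(["python", "midi_cls/train_test.py", "--reproduce"], check=True)
```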
I used inference_batch.py because I wanted to test the best weights you provided on the EMOPIA dataset. Could I use train_test.py to do the same thing (only testing, no training)?
I just want to double-check the result. It is strange that the results are different even when there are no other factors. I will also check my inference code!
tain_test1030.csv inference1030.csv
With best_weight, I found that the train_test.py and inference.py results were different. I think batch inference
and zero padding seem to have affected the performance. There are only 87 test samples, so small differences have a big effect on the results.
There is no problem with the best weight. I will modify the inference code to the train_test style soon.
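Roughly, the effect looks like the minimal sketch below. This is not the repo's model; all names here are hypothetical. If padded steps are not masked, the last hidden state, and therefore the logits, shift:

```python
# Minimal sketch (NOT the repo's model) of how zero padding in batched
# inference can change the logits when padded steps are not masked.
import torch
import torch.nn as nn

torch.manual_seed(42)

class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
        self.head = nn.Linear(16, 4)  # e.g. 4 emotion quadrants (AV)

    def forward(self, x):
        out, _ = self.lstm(x)         # padded steps are processed like real ones
        return self.head(out[:, -1])  # last step has already absorbed the padding

model = TinyClassifier().eval()
clip = torch.randn(1, 50, 8)          # one clip, true length 50

with torch.no_grad():
    exact = model(clip)                                    # exact-length inference
    padded = torch.cat([clip, torch.zeros(1, 30, 8)], 1)   # zero-padded to length 80
    batch_style = model(padded)

print(torch.allclose(exact, batch_style))  # False: padding shifted the output
```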
Thanks a lot!! Could you possibly explain the difference between the two results further after you modify this?
Hi, sorry to disturb. Did you find the problem that caused them to differ? Was it zero padding that interfered with the result in inference_batch?
I tested inference_batch.py on dataset/split/test.csv and got 0.57/0.744/0.744 on AV/A/V respectively. The models I used were downloaded from https://drive.google.com/u/0/uc?id=1L_NOVKCElwcYUEAKp1-FZj_G6Hcq2g2c&export=download (the link provided in README.md).