Open Laubeee opened 10 months ago
Hi, I am interested in reproducing the numbers you reported on NSynth. With the models from HuggingFace I get close to, but not quite, what you report (0.4-0.8 lower for the models I tried: 330M, 95M-public, and data2vec). May I ask whether you used the settings in the MARBLE-Benchmark repository to achieve these numbers, i.e. training one hidden layer of 512 units with 128 outputs, for at most 50 epochs with early stopping and LR reduction, batch size 64, and 5 runs with different learning rates?

Hello! I have also been struggling to replicate the performance reported on the MARBLE benchmark, but on the MTG dataset tasks (Mood, Genre, and Instrument). I likewise used the same setup as the MARBLE repository, which is very similar to the one described by @Laubeee.
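For concreteness, the probing setup described above (one hidden layer of 512 units, 128 outputs, batch size 64, max 50 epochs, 5 runs with different learning rates) could be sketched as below. This is only my reading of the config, not the authors' actual code; the input feature dimension (768) and the learning-rate grid are assumptions I made for illustration:

```python
import numpy as np

# Hyperparameters as I understand the MARBLE-Benchmark setup (sketch only):
CONFIG = {
    "hidden_units": 512,
    "num_outputs": 128,
    "batch_size": 64,
    "max_epochs": 50,            # with early stopping and LR reduction
    "learning_rates": [1e-2, 5e-3, 1e-3, 5e-4, 1e-4],  # hypothetical LR grid
}

def init_probe(in_dim, hidden=512, out=128, seed=0):
    """Randomly initialize a one-hidden-layer MLP probe."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.standard_normal((in_dim, hidden)) * np.sqrt(2.0 / in_dim),
        "b1": np.zeros(hidden),
        "W2": rng.standard_normal((hidden, out)) * np.sqrt(2.0 / hidden),
        "b2": np.zeros(out),
    }

def probe_forward(params, x):
    """Forward pass: Linear -> ReLU -> Linear, returning class logits."""
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    return h @ params["W2"] + params["b2"]

# Example: one batch of pooled features; 768 dims is an assumed feature size.
feats = np.zeros((CONFIG["batch_size"], 768))
logits = probe_forward(init_probe(768), feats)
print(logits.shape)  # (64, 128)
```

In this reading, the frozen upstream model only supplies the features; the probe above is the only trained component, and the 5 runs would each use a different entry from the LR grid.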