mimbres / YourMT3

multi-task and multi-track music transcription for everyone
GNU General Public License v3.0

Empty transcription when using the 'YPTF+Multi (PS)' model in Colab Demo #3

Closed herve-ves closed 3 months ago

herve-ves commented 3 months ago

Hi, thank you for your time. As the title says: does the code pair this model with a mismatched MoE checkpoint?

elif model_name == "YPTF+Multi (PS)":
    checkpoint = "mc13_256_g4_all_v7_mt3f_sqr_rms_moe_wf4_n8k2_silu_rope_rp_b80_ps2@model.ckpt"
    args = [checkpoint, '-p', project, '-tk', 'mc13_full_plus_256',
            '-dec', 'multi-t5', '-nl', '26', '-enc', 'perceiver-tf',
            '-ac', 'spec', '-hop', '300', '-atc', '1', '-pr', precision]

Here is the result I got from Colab. I also restarted the runtime and reproduced it.

[screenshot]

mimbres commented 3 months ago

@herve-ves No, that doesn't make sense! What sample or example audio did you use?

By the way, did you modify the code? Your snippet lacks the options for residual connections and MoE. Out of curiosity, why are you trying to remove options? The checkpoint should be loaded with the same options it was trained with. Otherwise, you will get weird output (empty output, in your case).

Original code:

elif model_name == "YPTF.MoE+Multi (PS)":
    checkpoint = "mc13_256_g4_all_v7_mt3f_sqr_rms_moe_wf4_n8k2_silu_rope_rp_b80_ps2@model.ckpt"
    args = [checkpoint, '-p', project, '-tk', 'mc13_full_plus_256', '-dec', 'multi-t5',
            '-nl', '26', '-enc', 'perceiver-tf', '-sqr', '1', '-ff', 'moe',
            '-wf', '4', '-nmoe', '8', '-kmoe', '2', '-act', 'silu', '-epe', 'rope',
            '-rp', '1', '-ac', 'spec', '-hop', '300', '-atc', '1', '-pr', precision]

herve-ves commented 3 months ago

Hi mimbres,

Thanks for your reply. I didn't modify the code; I used the non-MoE model, which is named 'YPTF+Multi (PS)'. The options I posted above are copied from the Colab notebook.

[image]

I tried three examples (Slakh_test_1884.wav, MAPS_MUS-scn15_11_ENSTDkAm.wav, musicnet_2628.wav), and all of the output MIDI files were empty.

The exact steps to reproduce are:

  1. Open the Colab Demo from the link of README.md;
  2. Connect the runtime and run the code blocks in order;
  3. At the sixth block (Load Checkpoint), set model_name to 'YPTF+Multi (PS)' and precision to '16';
  4. After running the seventh block (Run GradIO), select any example wave in the Gradio frame and click 'Transcribe';
  5. Finally, the weird (empty) output appears.

mimbres commented 3 months ago

@herve-ves Sorry, I forgot to update the model checkpoint! Thanks for finding the bug! I've now corrected the checkpoint name, so it should work fine now! 🙏

elif model_name == "YPTF+Multi (PS)":
    checkpoint = "mc13_256_all_cross_v6_xk5_amp0811_edr005_attend_c_full_plus_2psn_nl26_sb_b26r_800k@model.ckpt"
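
The mismatch discussed in this thread (a checkpoint whose file name contains `moe`, `silu`, `rope`, etc., loaded with args that lack the matching options) could be caught before loading with a small sanity check. The sketch below is only an illustration: `TOKEN_TO_OPTION` and `missing_options` are hypothetical names that are not part of YourMT3, and the token-to-flag mapping is an assumption inferred from the "Original code" snippet above, not a documented naming convention.

```python
# Hypothetical helper, not part of YourMT3.
# Assumed mapping from checkpoint-name tokens to the (flag, value) pair
# that appears in the "YPTF.MoE+Multi (PS)" args quoted in this thread.
TOKEN_TO_OPTION = {
    "sqr": ("-sqr", "1"),
    "moe": ("-ff", "moe"),
    "silu": ("-act", "silu"),
    "rope": ("-epe", "rope"),
    "rp": ("-rp", "1"),
}


def missing_options(checkpoint, args):
    """Return (flag, value) pairs hinted at by the checkpoint name but absent from args."""
    # Drop the "@model.ckpt" suffix, then split the name into its tokens.
    name = checkpoint.split("@")[0]
    tokens = set(name.split("_"))
    # Every adjacent pair in args; real (flag, value) pairs are among them.
    pairs = set(zip(args, args[1:]))
    return [opt for tok, opt in TOKEN_TO_OPTION.items()
            if tok in tokens and opt not in pairs]


moe_ckpt = "mc13_256_g4_all_v7_mt3f_sqr_rms_moe_wf4_n8k2_silu_rope_rp_b80_ps2@model.ckpt"
# Args as reported in the issue; "project" and "16" are placeholder values.
reported_args = [moe_ckpt, "-p", "project", "-tk", "mc13_full_plus_256",
                 "-dec", "multi-t5", "-nl", "26", "-enc", "perceiver-tf",
                 "-ac", "spec", "-hop", "300", "-atc", "1", "-pr", "16"]

for flag, value in missing_options(moe_ckpt, reported_args):
    print(f"checkpoint name suggests {flag} {value}, but args do not set it")
```

With the args from the original report, the check flags all five assumed options. With the corrected non-MoE checkpoint name, none of the tokens appear in the file name, so nothing is flagged.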