openvpi / SOME

SOME: Singing-Oriented MIDI Extractor.
MIT License

Can this get the midi duration and sequence ? #2

Closed francqz31 closed 11 months ago

francqz31 commented 11 months ago

@yqzhishen Hello, it's me again. I just noticed this project. As we discussed before, can it be used to get the "MIDI sequence | MIDI duration sequence", especially for English?

I mean the format used in opencpop that you mentioned before: "filename | lyrics | phoneme sequence | MIDI sequence | MIDI duration sequence | phoneme duration sequence | is slur sequence"

I'm not looking for a way to get the phoneme sequence, the phoneme duration sequence, or the is slur sequence right now. All I need is an accurate MIDI sequence and MIDI duration sequence. Can SOME do that?

Thanks in advance!

https://github.com/openvpi/DiffSinger/issues/29

yqzhishen commented 11 months ago

Yes, SOME is language-independent, but you still need .ds + .wav files to train.

francqz31 commented 11 months ago

@yqzhishen So SOME doesn't produce the MIDI duration sequence, just the MIDI sequence?

yqzhishen commented 11 months ago

At inference time SOME produces 3 outputs:

francqz31 commented 11 months ago

OK, fair enough. Thanks again. I will reopen the issue if the training scripts get released and I run into problems with them. I'll be waiting, since I want to train in English.

francqz31 commented 11 months ago

Hello again, Mr. yqzhishen. I ran python infer.py --model CKPT_PATH --wav WAV_PATH and got this output:

accumulate_grad_batches: 1, audio_sample_rate: 44100, binarization_args: {'num_workers': 0, 'shuffle': True}, binarizer_cls: preprocessing.MIDIExtractionBinarizer, binary_data_dir: data/some_ds_fixmel_spk3_aug8/binary, clip_grad_norm: 1, dataloader_prefetch_factor: 2, ddp_backend: nccl, ds_workers: 4, finetune_ckpt_path: None, finetune_enabled: False, finetune_ignored_params: [], finetune_strict_shapes: True, fmax: 8000, fmin: 40, freezing_enabled: False, frozen_params: [], hop_size: 512, log_interval: 100, lr_scheduler_args: {'min_lr': 1e-05, 'scheduler_cls': 'lr_scheduler.scheduler.WarmupLR', 'warmup_steps': 5000}, max_batch_frames: 80000, max_batch_size: 8, max_updates: 10000000, max_val_batch_frames: 10000, max_val_batch_size: 1, midi_extractor_args: {'attention_drop': 0.1, 'attention_heads': 8, 'attention_heads_dim': 64, 'conv_drop': 0.1, 'dim': 512, 'ffn_latent_drop': 0.1, 'ffn_out_drop': 0.1, 'kernel_size': 31, 'lay': 8, 'use_lay_skip': True}, midi_max: 128, midi_min: 0, midi_num_bins: 256, midi_prob_deviation: 0.5, midi_shift_proportion: 0.0, midi_shift_range: [-6, 6], model_cls: modules.model.Gmidi_conform.midi_conforms, num_ckpt_keep: 5, num_sanity_val_steps: 1, num_valid_plots: 300, optimizer_args: {'beta1': 0.9, 'beta2': 0.98, 'lr': 0.0001, 'optimizer_cls': 'torch.optim.AdamW', 'weight_decay': 0}, pe: rmvpe, pe_ckpt: pretrained/rmvpe/model.pt, permanent_ckpt_interval: 40000, permanent_ckpt_start: 200000, pl_trainer_accelerator: auto, pl_trainer_devices: auto, pl_trainer_num_nodes: 1, pl_trainer_precision: 32-true, pl_trainer_strategy: auto, raw_data_dir: [], rest_threshold: 0.1, sampler_frame_count_grid: 6, seed: 114514, sort_by_len: True, task_cls: training.MIDIExtractionTask, test_prefixes: None, train_set_name: train, units_dim: 80, units_encoder: mel, units_encoder_ckpt: pretrained/contentvec/checkpoint_best_legacy_500.pt, use_buond_loss: True, use_midi_loss: True, val_check_interval: 4000, valid_set_name: valid, win_size: 2048
| load 'model' from '/content/SOME/model_steps_64000_simplified.ckpt'.
100% 1/1 [00:01<00:00, 1.84s/it]
MIDI file saved at: '/content/SOME/202.mid'

It converted the singing WAV file into MIDI. Now, how can I get the MIDI sequence and MIDI duration sequence from this MIDI file? What should I do?

Thanks in advance!

yqzhishen commented 11 months ago

You can use any editor or package that supports importing MIDI files and extracting their contents. But if you are able to read the code, you can get the raw outputs in infer.py before the MIDI file is saved.
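As a rough illustration of the first route, here is a minimal sketch that turns note events into a pitch sequence and a duration sequence in seconds. It assumes the notes have already been read from the .mid file by some MIDI package as (pitch, start_tick, end_tick) tuples, that the melody is monophonic, and that the tempo is constant. The function name, the tuple layout, and the default ticks_per_beat and tempo_bpm values are hypothetical and must be replaced with the values stored in the actual file.

```python
def notes_to_sequences(notes, ticks_per_beat=480, tempo_bpm=120.0):
    """Convert (pitch, start_tick, end_tick) tuples into two parallel lists.

    Gaps between consecutive notes are emitted as "rest" entries so that the
    durations add up to the full length of the phrase. Assumes a monophonic
    melody and a constant tempo (both are assumptions of this sketch).
    """
    pitches, durations = [], []
    cursor = 0  # tick position where the previous note ended
    for pitch, start, end in sorted(notes, key=lambda n: n[1]):
        if start > cursor:
            # Silence between the previous note and this one.
            pitches.append("rest")
            durations.append((start - cursor) * 60.0 / (tempo_bpm * ticks_per_beat))
        pitches.append(str(pitch))
        # One beat lasts 60 / tempo_bpm seconds and spans ticks_per_beat ticks.
        durations.append((end - start) * 60.0 / (tempo_bpm * ticks_per_beat))
        cursor = max(cursor, end)
    return pitches, durations


if __name__ == "__main__":
    example = [(60, 0, 480), (62, 720, 960)]  # made-up notes for illustration
    seq, dur = notes_to_sequences(example)
    print(" ".join(seq))
    print(" ".join(f"{d:.6f}" for d in dur))
```

The tick-to-second conversion is the only MIDI-specific arithmetic here; everything else is bookkeeping to keep the two sequences aligned.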

francqz31 commented 11 months ago

Well, unfortunately I have no idea how to do either of these two things. How can I get the raw outputs before the MIDI file is saved?

yqzhishen commented 11 months ago

What are you using the MIDI file for?

Here, midis holds the raw outputs.

https://github.com/openvpi/SOME/blob/e0ca1ed9b2e71bfeb2176bc51f9fa0469f3ea0de/infer.py#L37
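To make this concrete, the sketch below formats raw note-level outputs as two space-separated strings, one for the MIDI sequence and one for the MIDI duration sequence. It assumes each segment provides three parallel sequences: note pitches as MIDI numbers, note durations in seconds, and rest flags. These field names (note_midi, note_dur, note_rest) and the helper names are assumptions not confirmed in this thread, so check them against infer.py before relying on them. The note naming with sharps only (C4 = 60) is also a simplification and may not match the exact spelling opencpop uses.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(pitch):
    """Round a (possibly fractional) MIDI pitch and name it, with C4 = 60."""
    number = int(round(pitch))
    return f"{NOTE_NAMES[number % 12]}{number // 12 - 1}"


def to_sequences(note_midi, note_dur, note_rest):
    """Build the 'MIDI sequence' and 'MIDI duration sequence' strings.

    The three arguments are assumed to be parallel sequences; rest notes are
    written as "rest" regardless of their pitch value.
    """
    names = [
        "rest" if is_rest else midi_to_name(pitch)
        for pitch, is_rest in zip(note_midi, note_rest)
    ]
    durations = [f"{seconds:.6f}" for seconds in note_dur]
    return " ".join(names), " ".join(durations)


if __name__ == "__main__":
    # Made-up values for illustration only, not real model output.
    midi_seq, dur_seq = to_sequences(
        [60.2, 0.0, 69.0], [0.5, 0.25, 1.0], [False, True, False]
    )
    print(midi_seq)
    print(dur_seq)
```

If the raw outputs are NumPy arrays or tensors, converting them with .tolist() first keeps the helper free of extra dependencies.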

francqz31 commented 11 months ago

I'm using the MIDI file to build a dataset like opencpop, but in English. I already have a way to get the phoneme sequence and phoneme durations, and now I'm looking to get the MIDI sequence and MIDI durations from SOME.

yqzhishen commented 11 months ago

some_batch_infer.zip

Maybe this script can help, but it is not well-documented. You need to put it in your SOME directory, edit the parameters and options in the file, and run

francqz31 commented 11 months ago

Oh, thanks so much. I edited all the parameters (input_csv, out_csv, wav_folder, model_path) and got this output: csv_datas:1 success: 0. My result.csv looks just like my transcriptions.csv (I put the English transcription of my wav file in transcriptions.csv).