Open greenbech opened 4 years ago
Hi, please try this one, trained for 500,000 iterations on the MAESTRO dataset.
I haven't touched the model in a while, but torch.load('model-500000.pt') should be able to load the PyTorch model.
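For anyone landing here, a minimal sketch of loading the checkpoint directly. It assumes the file holds a whole pickled model (as the torch.load call above suggests) rather than a state_dict. The helper name `load_full_model` is made up for illustration, and the code that defines the model class has to be importable when the file is unpickled.

```python
import inspect

import torch


def load_full_model(path, device='cpu'):
    """Load a whole pickled model (assumed: not just a state_dict) onto `device`."""
    kwargs = {'map_location': device}
    # Newer PyTorch releases default to weights_only=True, which rejects
    # pickled modules. Older releases do not know the argument at all,
    # so only pass it where torch.load actually accepts it.
    if 'weights_only' in inspect.signature(torch.load).parameters:
        kwargs['weights_only'] = False
    model = torch.load(path, **kwargs)
    # Switch to inference mode (disables dropout, fixes batch-norm statistics).
    return model.eval()
```

Under those assumptions, `model = load_full_model('model-500000.pt')` would return the model ready for inference. Because this unpickles arbitrary objects, it should only be used on files from a source you trust.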
The provided file works great, thanks a lot! I didn't need to call torch.load('model-500000.pt') myself, since both evaluate.py and transcribe.py take the model file as an argument.
```bash
Traceback (most recent call last):
  File "transcribe.py", line 101, in
```
Downgrading torch from 1.4.0 to torch==1.2.0 fixed it for me.
It is also quite cumbersome to resample the audio file to 16 kHz beforehand, so I added this locally to transcribe.py:
    import librosa
    import numpy as np


    def float_samples_to_int16(y):
        """Convert floating-point numpy array of audio samples to int16."""
        # From https://github.com/tensorflow/magenta/blob/671501934ff6783a7912cc3e0e628fd0ea2dc609/magenta/music/audio_io.py#L48
        if not issubclass(y.dtype.type, np.floating):
            raise ValueError('input samples not floating-point')
        return (y * np.iinfo(np.int16).max).astype(np.int16)


    def load_and_process_audio(flac_path, sequence_length, device):
        random = np.random.RandomState(seed=42)

        # librosa resamples to SAMPLE_RATE on load and returns float samples.
        audio, sr = librosa.load(flac_path, sr=SAMPLE_RATE)
        audio = float_samples_to_int16(audio)

        assert sr == SAMPLE_RATE
        assert audio.dtype == 'int16'
        ...
There might be more elegant ways of doing this, but I was not able to convert to int16 with librosa or resample with soundfile.read.
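One detail in the conversion above may be worth guarding. Resampled audio can overshoot the [-1.0, 1.0] range slightly, and a scaled value outside that range does not fit in int16. The following is a minimal sketch of a clipped variant. It is not part of this repo or of Magenta, and the name `float_to_int16_clipped` is hypothetical.

```python
import numpy as np


def float_to_int16_clipped(y):
    """Scale float samples to int16, clipping values outside [-1.0, 1.0] first."""
    y = np.asarray(y)
    if not np.issubdtype(y.dtype, np.floating):
        raise ValueError('input samples not floating-point')
    scale = np.iinfo(np.int16).max  # 32767
    # Clip before scaling so overshoots saturate instead of overflowing int16.
    return (np.clip(y, -1.0, 1.0) * scale).astype(np.int16)


# The last sample (1.2) is out of range and saturates at the int16 maximum.
samples = np.array([0.0, 0.5, 1.0, -1.0, 1.2], dtype=np.float32)
converted = float_to_int16_clipped(samples)
```

The scaling itself is unchanged from the snippet above, so in-range samples convert identically. Only out-of-range samples behave differently.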
I also think the model you provided should be linked from the README so that others can try it out without having to find this issue. It could either go directly in ./data/pretrained, which is the easiest setup but increases the repo size unnecessarily, or be linked via the drive URL you provided.
Would you mind a PR with this?
Yeah! I'll need some housekeeping to make the checkpoint work cross-version. PR is welcome! Thanks :D
It would be great if anyone could upload a pretrained model so that we could try this project without needing to train the model first. It is quite a big commitment to wait a week for training (as mentioned in #10) if you primarily just want to check out the performance on some .wav files.
I would also like to say this repo is very well written and educational. Thanks!