BShakhovsky / PolyphonicPianoTranscription

Recurrent Neural Network for generating piano MIDI-files from audio (MP3, WAV, etc.)
https://magenta.tensorflow.org/onsets-frames
230 stars 41 forks

tensorflow compat #3

Closed emusiceducation123 closed 2 years ago

emusiceducation123 commented 4 years ago

Hi, I couldn't find your email, so I opened this ticket; I hope you don't mind. I'm new to TensorFlow and very interested in piano transcription. I'm working on a game that uses TensorFlow Lite, but the model provided by the Magenta team has a dynamic input size and is incompatible with the GPU delegate and the NNAPI delegate. I stumbled onto your project and saw that you have quite some experience in this domain, so I'm wondering if you could help me. When I run your notebook, I hit many compatibility issues, since my computer has TensorFlow v2, which doesn't have tensorflow.Session, for example. I'm also running into many errors about missing *.npy files, and I'm not sure what they are. Could you provide a link to the onset_frames model in SavedModel format? I tried to load their estimator but couldn't figure out how to save it as a SavedModel so that I can convert it to TensorFlow Lite myself. Of course, there are other issues, such as compatible ops.

I really appreciate your help!

BShakhovsky commented 4 years ago

Hello,

Yes, I just tried, and I also got many errors with TensorFlow v2. The easiest solution was to downgrade TensorFlow to version 1.15, re-run my notebook ("3 Magenta to Keras.ipynb"), and save the Keras models in HDF5 format again. Then I re-opened them and saved them in TensorFlow SavedModel format using the following code:

import tensorflow as tf

onsetsModel = tf.keras.models.load_model('Keras Onsets.hdf5', compile=False)
tf.saved_model.save(onsetsModel, 'Tensorflow Onsets')
# The same for Offsets, Frames and Velocities.

I have just uploaded the models to my latest release, Keras-models here: https://github.com/BShakhovsky/PolyphonicPianoTranscription/releases/download/2019-06-22/Keras_HDF5.zip and tensorflow SavedModel files here: https://github.com/BShakhovsky/PolyphonicPianoTranscription/releases/download/2019-06-22/Tensorflow_SavedModel.zip

I have never worked with the SavedModel format before, so I hope that these files are what you need and that you will be able to use them.
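For the TensorFlow Lite conversion mentioned in the question, a minimal sketch could look like the following. It assumes the SavedModel directory is named 'Tensorflow Onsets', as in the snippet above. The function name and the output file name are hypothetical, and whether every op converts cleanly will depend on the TensorFlow version in use.

```python
from pathlib import Path


def convert_saved_model_to_tflite(saved_model_dir, tflite_path):
    """Convert a SavedModel directory to a .tflite file; return its size in bytes."""
    # Imported lazily so this file can be loaded on a machine without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(str(saved_model_dir))
    tflite_bytes = converter.convert()  # serialized model as a bytes object
    Path(tflite_path).write_bytes(tflite_bytes)
    return len(tflite_bytes)
```

Calling convert_saved_model_to_tflite('Tensorflow Onsets', 'onsets.tflite') would then be repeated for the Offsets, Frames and Velocities directories.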

However, my models differ slightly from Magenta's. Their model is too big for my GPU, so I had to split it into four separate sub-models: onsets, offsets, frames (actives), and volumes (velocities). First, onsets and offsets are predicted from the spectrogram. Then they are fed as input (together with the spectrogram again) to the frames model. Velocities are up to you; you do not have to predict them if you do not want to.
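The order in which the sub-models run can be sketched as below. This only illustrates the data flow: the function name is hypothetical, the models are passed in as plain callables, and both the three-argument call to the frames model and the spectrogram-only input to the velocities model are assumptions, since the exact input signatures are not stated here.

```python
def transcribe(spectrogram, onsets_model, offsets_model, frames_model,
               velocities_model=None):
    """Run the sub-models in the order described above and collect the outputs."""
    # Step 1: onsets and offsets are each predicted from the spectrogram alone.
    onsets = onsets_model(spectrogram)
    offsets = offsets_model(spectrogram)

    # Step 2: the frames model receives the spectrogram again plus both
    # predictions. Passing three separate arguments is an assumption.
    frames = frames_model(spectrogram, onsets, offsets)

    result = {"onsets": onsets, "offsets": offsets, "frames": frames}

    # Step 3 (optional): velocities may be skipped entirely.
    # Feeding only the spectrogram here is an assumption of this sketch.
    if velocities_model is not None:
        result["velocities"] = velocities_model(spectrogram)
    return result
```

With real models, each callable would wrap something like the loaded Keras model's predict method.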

emusiceducation123 commented 4 years ago

Thank you so much! I’ll try those. Sorry for the very newbie question, but what’s a spectrogram and how do I calculate it? Is it related to the FFT?

BShakhovsky commented 4 years ago

Yes, it is related to the FFT, but the Onsets and Frames model takes a mel-scaled spectrogram as input (with logarithmically spaced frequency bins). In Python it is very easy to calculate using the "librosa" library. You can see an example in my notebook: https://github.com/BShakhovsky/PolyphonicPianoTranscription/blob/master/4%20Piano%20Audio%20to%20Midi.ipynb

If your game is in another programming language, then it is going to be more difficult, but there may also be a suitable library for mel-spectrogram calculation.
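As a rough illustration of the librosa route, the sketch below computes a log-scaled mel spectrogram from an audio file. The function name is hypothetical, and the default parameter values (16 kHz sample rate, hop length 512, 229 mel bins, 30 Hz minimum frequency, HTK mel scale) are assumptions meant to resemble Magenta's published settings. The notebook linked above remains the authority on what these particular models expect.

```python
def log_mel_spectrogram(audio_path, sample_rate=16000, hop_length=512,
                        n_mels=229, fmin=30.0):
    """Return a log-scaled mel spectrogram with shape (frames, n_mels)."""
    # Imported lazily so this file can be loaded without librosa installed.
    import librosa

    # Load the file as mono audio, resampled to the requested rate.
    samples, _ = librosa.load(audio_path, sr=sample_rate, mono=True)

    # Short-time FFT followed by a mel filter bank (power spectrogram).
    mel = librosa.feature.melspectrogram(
        y=samples, sr=sample_rate, hop_length=hop_length,
        n_mels=n_mels, fmin=fmin, htk=True)

    # Convert power to decibels, then put time on the first axis.
    return librosa.power_to_db(mel).T
```

Each row of the result is one time frame, so a longer recording simply yields more rows.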

emusiceducation123 commented 4 years ago

Thanks a lot!

emusiceducation123 commented 4 years ago

Hi Boris, I've successfully converted the models so that they can be loaded in TensorFlow Lite on iOS. I ran into some errors when I turned on the GPU delegate; it complains about PACK, UNPACK, etc. I suppose I'll need to replace the original layers with equivalent structures that use the ops supported by the GPU delegate. Did you have to do something similar? If you could point me in the right direction, I'd really appreciate it.

Thanks! Ming
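One way to see which ops trigger such complaints before deploying is the model analyzer shipped with recent TensorFlow 2.x releases. The sketch below is only a starting point: the function name is hypothetical, it assumes a TensorFlow version that provides tf.lite.experimental.Analyzer with the gpu_compatibility option, and the report is advisory rather than a guarantee of how a given device will behave.

```python
def report_gpu_incompatible_ops(tflite_path):
    """Print the model structure, flagging ops the GPU delegate may reject."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    import tensorflow as tf

    # The analyzer writes its report to standard output.
    tf.lite.experimental.Analyzer.analyze(
        model_path=str(tflite_path), gpu_compatibility=True)
```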

BShakhovsky commented 4 years ago

Hi Ming,

Sorry, unfortunately I do not have any experience with TFlite or running models on mobile devices.

BShakhovsky commented 3 years ago

Hello Ming,

In case your question is still relevant: I tried to run Magenta's TF-lite model on Android, and I got the same errors with the GPU delegate. I do not know what to do about it, so I just do not add the GPU delegate. The model still runs super-fast both on my Android device and on emulators.
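For reference, running a TF-lite model on the CPU only, with no delegate attached, can be sketched in Python as follows. This is not the Android API, just the same flow in the Python interpreter. The function name is hypothetical, and the sketch assumes a model with a single float32 input whose size may be dynamic, which is why the input tensor is resized before allocation.

```python
def run_tflite_on_cpu(tflite_path, spectrogram):
    """Run one inference without any delegate and return the first output."""
    # Imported lazily so this file can be loaded without these packages.
    import numpy as np
    import tensorflow as tf

    # No delegates are passed, so inference runs on the CPU.
    interpreter = tf.lite.Interpreter(model_path=str(tflite_path))
    input_index = interpreter.get_input_details()[0]["index"]

    batch = np.asarray(spectrogram, dtype=np.float32)
    # Fix the dynamic input size to the shape of this particular input.
    interpreter.resize_tensor_input(input_index, batch.shape)
    interpreter.allocate_tensors()

    interpreter.set_tensor(input_index, batch)
    interpreter.invoke()

    output_index = interpreter.get_output_details()[0]["index"]
    return interpreter.get_tensor(output_index)
```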

Glooring commented 3 years ago

Hi Boris,

I tried the code from "2. Training, Validation, Testing" with different versions of Python. At certain steps I come across errors, and some libraries fail to import. May I ask which version of Python you used?

Thanks, Dan

BShakhovsky commented 3 years ago

Hello Dan,

It was Python 3.6.8 at that time, and I do not remember the TensorFlow version. Currently, I have Python 3.8.8 and TensorFlow 2.4.0. I have just uploaded a slightly corrected version of the template, which runs fine on my machine.

I also often get errors about missing libraries, usually after reinstalling Anaconda or updating some libraries, especially after updating TensorFlow. In that case I just Google for the answer, install, reinstall, or update the libraries, and after some trial and error I manage to make it work.

Glooring commented 3 years ago

Thank you very much!