Neutone / neutone_sdk

Join the community on Discord for more discussions around Neutone! https://discord.gg/VHSMzb8Wqp
GNU Lesser General Public License v2.1

[MODEL] RAVE.sakamoto #32

Closed scottyeung closed 1 year ago

scottyeung commented 1 year ago

A brief description of what your model does

The model was trained on two sets of live performances by Ryuichi Sakamoto and Alva Noto. Both sets share similar tracks and timbral qualities, so the dataset focuses on variants of the same compositions rather than on two sets of completely different material.

  1. Alva Noto + Ryuichi Sakamoto – Insen - 2016
  2. Alva Noto & Ryuichi Sakamoto - Two (Live at Sydney Opera House) - 2019
```
train set: 7414 examples
val set: 152 examples
selected gpu: [0]
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

0 | pqmf                     | CachedPQMF            | 16.7 K
1 | encoder                  | WasserteinEncoder     | 15.0 M
2 | decoder                  | GeneratorV2           | 15.6 M
3 | discriminator            | CombineDiscriminators | 27.1 M
4 | audio_distance           | AudioDistanceV1       | 0
5 | multiband_audio_distance | AudioDistanceV1       | 0

57.7 M    Trainable params
0         Non-trainable params
57.7 M    Total params
230.605   Total estimated model params size (MB)
```
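As a quick sanity check, the per-module counts in the summary add up to the reported total; a back-of-the-envelope in Python, using the rounded values from the log and assuming fp32 (4-byte) parameters for the size estimate:

```python
# Per-module trainable parameter counts, rounded as reported in the log above.
params = {
    "pqmf": 16_700,               # CachedPQMF
    "encoder": 15_000_000,        # WasserteinEncoder
    "decoder": 15_600_000,        # GeneratorV2
    "discriminator": 27_100_000,  # CombineDiscriminators
}
total = sum(params.values())
print(f"{total / 1e6:.1f} M trainable params")  # 57.7 M, matching the log

# Rough size estimate at 4 bytes (fp32) per parameter; the log's 230.605 MB
# differs slightly because the per-module counts above are rounded.
print(f"~{total * 4 / 1e6:.1f} MB")
```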

Checklist

Extra information

Trained on the official RAVEv2 Google Colab notebook, then modified to export with the Neutone wrapper.

Metadata

The model export function should dump a JSON file. Please paste the contents here for review and discussion.

```json
{
  "model_name": "RAVE.sakamoto",
  "model_authors": ["Scott Young"],
  "model_version": "1.0.0",
  "model_short_description": "RAVE model trained on Sakamoto/Alva Noto performances.",
  "model_long_description": "RAVE timbre transfer model trained on piano and glitch.",
  "technical_description": "RAVE model proposed by Caillon, Antoine et al.",
  "technical_links": {
    "Paper": "https://arxiv.org/abs/2111.05011",
    "Code": "https://github.com/acids-ircam/RAVE"
  },
  "tags": ["timbre transfer", "RAVE"],
  "citation": "Caillon, A., & Esling, P. (2021). RAVE: A variational autoencoder for fast and high-quality neural audio synthesis. arXiv preprint arXiv:2111.05011.",
  "is_experimental": true,
  "neutone_parameters": {
    "p1": { "name": "Chaos", "description": "Magnitude of latent noise", "type": "knob", "used": "True", "default_value": "0.0" },
    "p2": { "name": "Z edit index", "description": "Index of latent dimension to edit", "type": "knob", "used": "True", "default_value": "0.0" },
    "p3": { "name": "Z scale", "description": "Scale of latent variable", "type": "knob", "used": "True", "default_value": "0.5" },
    "p4": { "name": "Z offset", "description": "Offset of latent variable", "type": "knob", "used": "True", "default_value": "0.5" }
  },
  "wet_default_value": 1.0,
  "dry_default_value": 0.0,
  "input_gain_default_value": 0.5,
  "output_gain_default_value": 0.5,
  "is_input_mono": true,
  "is_output_mono": true,
  "model_type": "mono-mono",
  "native_sample_rates": [44100],
  "native_buffer_sizes": [2048],
  "look_behind_samples": 0,
  "sdk_version": "1.1.3",
  "pytorch_version": "1.13.1+cu117",
  "date_created": 1682040929.641104
}
```
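For anyone reviewing a dump like this, the fields can be spot-checked with the standard library alone. A minimal sketch — the excerpt below repeats only a few fields from the dump above, and the list of "required" keys is an illustrative assumption, not the SDK's official schema:

```python
import json

# Excerpt of the metadata dump above, kept as the raw JSON string
# the export function would produce.
raw = """{
  "model_name": "RAVE.sakamoto",
  "model_version": "1.0.0",
  "native_sample_rates": [44100],
  "native_buffer_sizes": [2048]
}"""
meta = json.loads(raw)

# Fields a reviewer would want present before testing the model in a DAW.
# NOTE: this key list is illustrative, not the SDK's authoritative schema.
for key in ("model_name", "model_version", "native_sample_rates"):
    assert key in meta, f"missing metadata field: {key}"

print(f"{meta['model_name']} expects {meta['native_sample_rates'][0]} Hz "
      f"buffers of {meta['native_buffer_sizes'][0]} samples")
```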

bogdanteleaga commented 1 year ago

Hello Scott!

Thank you for your submission, we are happy to see people are training RAVE models.

We discussed this internally and decided against publishing this model on the browser, given the nature of the data used for training. For models trained on a specific artist's data, we would like to publish only models contributed by the artists themselves, or with the artist's approval. Unfortunately, Ryuichi Sakamoto's recent passing adds another layer of complexity to the submission: in this context, we don't believe it would be a good idea to publish this model even if approval were granted.

I will close the issue for now, but if you do manage to get approval in some form and you strongly feel that this model would be a good fit for our default collection, please let us know why and we will happily reconsider.