elgiano / nn.ar

nn_tilde adaptation for SuperCollider
GNU General Public License v3.0

Missing documentation on usage #7

Open trackme518 opened 9 months ago

trackme518 commented 9 months ago

Hi, I installed the extension in SuperCollider and it works, but it is unclear how your code works. Please add a more extensive description to your example. It loads the RAVE model. Then do I need to export my own msprior.ts based on the RAVE model and the training data I am using? Or should that be downloaded from somewhere? I would appreciate an already exported build, or links to a working set of models, to at least test it.

How does it function? It feeds noise into the RAVE model, which will do what? Generate audio? Why is there an msprior.ts? Please explain. Thank you.

elgiano commented 9 months ago

Hi @trackme518, thanks for reaching out. From your comment I understand that the docs can look too cryptic, as they assume a certain workflow without explaining it in detail. I'm looking forward to adding more descriptive text in the next few months. Could you help me understand what exactly your difficulties were?

Also, if anyone else is reading this issue and has similar concerns, they are welcome to add them to the discussion here.

I think, and other people using this extension have confirmed it, that the usage is quite clear for people who have used RAVE and nn_tilde before. But your comment is an occasion to make it clearer for people who perhaps haven't. What is your previous experience with RAVE? Or is nn.ar your first experience with it? And would the following kind of information help you?

  1. nn.ar is an interface to load trained RAVE models and to use their processing methods in real time. For more information about RAVE, please refer to its own documentation at https://github.com/acids-ircam/RAVE
  2. nn.ar doesn't include any pre-trained models, but some are shared by the RAVE project itself at https://github.com/acids-ircam/RAVE?tab=readme-ov-file#pretrained-models
  3. RAVE models typically offer three processing methods: encoding (i.e. generating a latent representation of the audio you input), decoding (i.e. the inverse, generating audio from a latent representation you input), and forwarding (generating audio from audio, without providing access to its latent representation). [here I could put examples for each method, a bit like RAVE does, so one example for reconstruction, one example for latent manipulation, one example for playing a RAVE decoder like a synth with obscure parameters]
  4. Some RAVE models include a "prior" method, which achieves unconditional generation. It takes only one input, the "temperature", and generates a stream of sound. [here I could put a RAVE v1 prior example]
  5. Other RAVE models don't include a prior, and nn.ar also works with generative models produced by the msprior library. msprior is still experimental, and msprior models can have different generative methods; refer to its own documentation for those. [msprior usage example]
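
As a rough illustration of how the three methods in point 3 relate to each other, here is a minimal Python sketch. It is a toy stand-in, not RAVE or nn.ar code: the `ToyCodec` class, its block-averaging "encoder" and the `ratio` value are all hypothetical, chosen only so that the relationship "forward is encode followed by decode" can be run and inspected.

```python
# Toy stand-in, NOT the RAVE or nn.ar API. It only mirrors how the three
# methods described above relate to each other.

class ToyCodec:
    """Hypothetical model: averages blocks of samples into a 1-D 'latent'."""

    def __init__(self, ratio=4):
        # Audio samples per latent frame (assumed value for illustration).
        self.ratio = ratio

    def encode(self, audio):
        # Audio in, latent out: one value per full block of `ratio` samples.
        r = self.ratio
        return [sum(audio[i:i + r]) / r
                for i in range(0, len(audio) - r + 1, r)]

    def decode(self, latent):
        # Latent in, audio out: each latent value is held for `ratio` samples.
        return [z for z in latent for _ in range(self.ratio)]

    def forward(self, audio):
        # Audio in, audio out: the latent is never exposed to the caller.
        return self.decode(self.encode(audio))


model = ToyCodec(ratio=4)
audio = [0.0, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0]

latent = model.encode(audio)            # reconstruction path, step 1
reconstructed = model.forward(audio)    # same as decode(encode(audio))
shifted = [z + 0.25 for z in latent]    # latent manipulation
resynth = model.decode(shifted)         # "playing" the decoder directly
```

In this sketch the decoder can be driven with any latent values, not only ones that came out of the encoder, which is the idea behind playing a decoder like a synth.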
trackme518 commented 9 months ago

Hi, thanks for reaching out. No, I have not used RAVE before. I am interested in traversing the latent space directly. My idea was to generate N-dimensional noise seeds that I would feed to the network. My confusion stems from the example that works with two models, msprior.ts and ~/rave/ravemodel.ts. Why? Do I need two models? Or is the example actually multiple examples merged into one block of code? Sorry, I don't get it.

elgiano commented 9 months ago

I see your point, I will update the docs. For now you can ignore msprior, you only need one model.

I'll write an example for your use case as soon as I can :)
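
Until such an example exists, the "N-dimensional noise seeds" idea can be sketched independently of nn.ar. The Python snippet below is only an assumption about what latent traversal might look like: `latent_walk`, its parameters and the choice of 8 dimensions are hypothetical, and the actual number of latent dimensions depends on the model you load. It produces a smooth random walk of latent vectors; feeding each vector to a model's decoder is left out, because that part depends on the host environment.

```python
import random


def latent_walk(n_dims, n_steps, step_size=0.1, limit=3.0, seed=0):
    """Return n_steps latent vectors forming a bounded random walk.

    All names and defaults here are illustrative assumptions, not values
    taken from RAVE or nn.ar.
    """
    rng = random.Random(seed)  # seeded so the walk is reproducible

    def clamp(v):
        return max(-limit, min(limit, v))

    # Start from an N-dimensional Gaussian noise seed.
    point = [clamp(rng.gauss(0.0, 1.0)) for _ in range(n_dims)]
    path = [point]
    for _ in range(n_steps - 1):
        # Nudge every dimension a little so consecutive vectors stay close.
        point = [clamp(v + rng.gauss(0.0, step_size)) for v in point]
        path.append(point)
    return path


# 8 dimensions is an assumed latent size; check your own model.
path = latent_walk(n_dims=8, n_steps=100)
```

A small `step_size` gives slowly evolving latents, while a larger one approaches independent noise seeds at every step.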

elgiano commented 6 months ago

@trackme518 I'm sorry it took so long, but I've updated the documentation in v0.0.4-alpha. If you're still playing with this, I would love to hear whether the docs are more helpful now.