jamulussoftware / jamulus

Jamulus enables musicians to perform real-time jam sessions over the internet.
https://jamulus.io

Server recording writes full mix instead of individual mix of "recorder" #433

Closed sthenos closed 3 years ago

sthenos commented 4 years ago

Ideally I'd like to be able to save a full mix of all the channels on the server instead of saving a full mix locally from a Jamulus client. I know we can re-mix the individually saved channel WAVs afterwards, but what I'd really like to be able to do is save a live mix on the server from one of the channels where the channel instrument has been set to 'recorder'.

I would be able to stream this full-mix WAV file straight to our OBS streaming software for streaming out to our various stream destinations. This would save a lot of the routing that I am currently doing by connecting multiple Jamulus clients to rooms.

corrados commented 4 years ago

I would be able to stream this full-mix WAV file straight to our OBS streaming software for streaming out to our various stream destinations

How do you stream a wav file? Isn't it just a file on the disk?

How would you select the mix to be stored? Maybe it would be easier to store all mixes. That would mean doubling the data on the disk. But I guess pljones has to comment on that because the jam recorder is his "baby" ;-).

jp8 commented 4 years ago

I think it would be useful to have a separate recording option. Instead of making a .wav for each client, it would just record a .wav of the individual mix of any client whose instrument type is "recorder".

I didn't try it yet, but I think I can just use cat or tail to pipe the .wav to ffmpeg for streaming. I'd welcome an RTP client built into the Jamulus server, though ;)

Could you tell me if the server uses an increased jitter buffer for recording? Is the recording jitter configurable?

WolfganP commented 4 years ago

I didn't try it yet, but I think I can just use cat or tail to pipe the .wav to ffmpeg for streaming. I'd welcome an RTP client built into the Jamulus server, though ;)

I was thinking along the same lines: either an RTP redirect or some kind of JACK interface for a back-office audio stream at server level, for those wanting to stream or post-process in real time.

BTW, Opus supports multichannel audio with up to 255 channels; maybe the already encoded audio can easily be repacked somehow?

pljones commented 4 years ago

Could you tell me if the server uses an increased jitter buffer for recording? Is the recording jitter configurable?

The recorder runs as a separate thread.

Each set of frames that the main server process handles for the connections in one "tick" gets passed to the recorder at the same time they pass through the server mixer.

What would be needed is another thread, running after the mix, to "record" a nominated connection (probably by a command line option to match against IP or client name and a UI input - ideally on the list of clients). Now, as that thread doesn't exist yet, what it actually does is open for discussion.

Should it write a standard WAV like the existing recorder, or raw 16-bit fixed-point LPCM data straight to disk (skipping the RIFF WAVE headers)? Or should it do something else entirely? I'd question why it would target the network -- the Jamulus client can already do that, so you end up duplicating functionality. Of course, on *nix, it could be writing to a file that's actually a connection to a process streaming to the network, so long as it opens the file in append mode (hence why I ask about file headers).
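As a minimal sketch of that raw, append-mode variant (the output path and block size here are assumptions for illustration, and this is not Jamulus code), something like the following would do on *nix, where the path could just as well be a FIFO feeding a streaming process:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

int main()
{
    const char* outPath        = "/tmp/fullmix.s16"; // could be a mkfifo'd pipe
    const int   samplesPerTick = 128;                // assumed block size
    std::vector<int16_t> block ( samplesPerTick * 2, 0 ); // interleaved stereo, silent

    // Append mode: there is no RIFF WAVE header to rewrite, so the file (or
    // pipe) only ever grows and a reader can follow it live.
    FILE* out = std::fopen ( outPath, "ab" );
    if ( out == nullptr )
        return 1;

    // A real recorder thread would do this once per server tick with the
    // actual mixed block instead of silence.
    std::fwrite ( block.data(), sizeof ( int16_t ), block.size(), out );

    std::fclose ( out );
    return 0;
}
```

A process reading the other end would simply interpret the bytes as signed 16-bit little-endian PCM at the server's sample rate.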

@sthenos, I think you need to make what you're thinking clearer.

corrados commented 4 years ago

I'd question why it would target the network -- the Jamulus client can already do that, so you end up duplicating functionality.

That is a good point. Let's wait for sthenos's feedback.

pljones commented 4 years ago

Hi @sthenos,

Simon, is this still something that needs working on, or can it be closed?

Thanks,

-- Peter

sthenos commented 4 years ago

Yes, let's close it. Thought it might be useful, but let's close.

corrados commented 4 years ago

I would be able to stream this full-mix WAV file straight to our OBS streaming software for streaming out to our various stream destinations. This would save a lot of the routing that I am currently doing by connecting multiple Jamulus clients to rooms.

Before we close this, I am curious how you do your World Jam audio routing. As far as I understand, you have a streamer Jamulus client on the main stage and monitor Jamulus clients in the waiting rooms. Let's assume you have only one waiting room. That would mean we have two instances of Jamulus clients. The stream client's audio output is routed to OBS and to the monitor client at the same time, right? I think you could do this with the JACK audio router. Or do you have a different setup?

sthenos commented 4 years ago

Hi Volker.

Yes, that's correct: we have one client connected to the main studio and then route it, using JACK, to four other clients that are on the two waiting rooms, the backstage and a private broadcast server. The OBS Windows cloud instance connects a Jamulus client to that private broadcast server.

Between the main studio client and the four output clients we route, via JACK, through Carla into five plugins: EQ, compressor, audio gain, reverb and limiter, which normalises the signal (boosts the bass, slightly compresses it and limits it to avoid peaks above -3 dB).

It all kind of works OK. The sound, however, is still not quite as good as the sound recorded on the server, even if we take all the Carla plugins out of the chain. jackd, the five clients and the main studio client all run on the same physical server, so there is no network in the way. The only place the network plays a part is from the OBS studio VM's Jamulus client to the broadcast server, but we have the buffer sample size set to 1024 and the jitter buffer set up to 20/20, so there should not be a single bit of jitter.

WolfganP commented 4 years ago

As in the other live-streaming use cases commented on, it seems the extra client used for output routing doesn't add any functionality to the setup, and the extra mix and compression may even reduce the sound quality. I still think a direct output of individual channels from the server (like the recording process) would be the best option for post-treatment, but I'm not sure which approach would need the lowest development effort (maybe using the current server as is, routing the recordings to named pipes and somehow capturing those in a DAW as multichannel?).

sthenos commented 4 years ago

Actually, without the second client we can't send the audio into OBS on the Windows server.

See the architecture diagram below for the flow.

corrados commented 4 years ago

Thanks for the picture. The audio path from the instrument to YouTube goes through four OPUS encoder/decoders and four jitter buffers. That explains the bad audio quality. I think you have to get rid of the Jamulus Server Broadcast if possible.

WolfganP commented 4 years ago

@sthenos I agree with @corrados, you have too many en/decoders and buffers in the audio chain from source (instrument) to publishing via OBS. Maybe using the concept of an audio matrix (i.e. a Catia patchbay) to route all rooms into it and switch the output of one of them to the publishing stream would be a better solution? (Kind of like rotating stages, where you expose just one to the public while the rest prepare in the other backstages.)

BTW, in exploring ideas to attach the server directly to a DAW for advanced streaming or recording use cases, I was recently researching the ReaStream protocol and it seems simple enough (UDP based, simple format). I'll see if I can find or quickly implement a test program to validate the protocol. https://github.com/niusounds/dart_reastream/blob/master/reastream_spec.txt & https://forum.cockos.com/showpost.php?p=1025391&postcount=4

corrados commented 4 years ago

Either I would run the Jamulus Client Main Studio on the Windows OBS PC, or you should replace the Carla->OBS connection with some other tool, since you do not need a low-latency connection via Jamulus there. You could use VLC for this connection and use an uncompressed stream.

sthenos commented 4 years ago

Yes, that would explain the degradation of audio quality, with too many en/decodes happening along the chain.

However, unfortunately OBS/Zoom does not work on headless Linux with no GPU, so it has to be Windows, and I detest using Windows for spinning up Jamulus servers and managing them. So for the moment we have to use two separate servers.

But one thing Jono and I have been talking about is whether the Jamulus server could expose a jackd interface to allow servers to be connected directly to each other. But upon more thought I think this could be tricky.

I think a good solution would be to allow the server to make a connection to another server and send an unencoded audio stream to it if a streamer/recorder client connects to it.

For instance, take this user story:

$ Jamulus -s --routeaudio "localhost:22125;Bob" --serverinfo "Alice; london; 224" -p 22124
$ Jamulus -s --acceptaudio "localhost; 10.10.10.10" --serverinfo "Alex; london; 224" -p 22125

  1. This would launch a server called "Alice". Initially on startup it does nothing.
  2. This would launch a server called "Alex". Initially on startup it does nothing.
  3. User "peter" (guitarist) connects to server "Alice" and starts sending/receiving audio. Server performs as it normally does.
  4. User "john" (streamer) connects to server "Alice" and sets a fader level of peter to 50%. Server makes a connection to localhost:22125 and sends all audio that it would be sending back to "john" unencoded, direct to server "Alex".
  5. Server Alex receives the connection from localhost as it's on the allowed list (it would also allow a connection from 10.10.10.10). Server Alex creates a channel for "Bob", as that is the name designated in the --routeaudio argument.
  6. Now anyone that connects to server "Alex" will see "Bob" as a fader and can mix it how they want.

Thoughts?

danryu commented 4 years ago

For my 2 cents, as a backend developer looking to make solutions with Jamulus, the suggestions made here by @pljones sound great...

This latter option would open up a lot of possibilities in terms of integrations with other systems. Count me +1 interested.

What would be needed is another thread, running after the mix, to "record" a nominated connection (probably by a command line option to match against IP or client name and a UI input - ideally on the list of clients). Now, as that thread doesn't exist yet, what it actually does is open for discussion.

Should it write a standard WAV, like the existing recorder or raw, 16 bit fixed-point LPCM data straight to disk (skipping RIFF WAVE headers)? Or should it do something else entirely? I'd question why it would target the network -- the Jamulus client can already do that, so you end up duplicating functionality. Of course, on *nix, it could be writing to a file that's actually a connection to a process streaming to the network, so long as it opens the file in append mode (hence why I ask about file headers).

dingodoppelt commented 4 years ago

Of course, on *nix, it could be writing to a file that's actually a connection to a process streaming to the network, so long as it opens the file in append mode (hence why I ask about file headers).

@pljones: I've imagined exactly that. We are currently streaming a server's output to Icecast with an additional Jamulus client running on the same machine (so jackd and its dependencies, Icecast and its JACK client are running alongside the Jamulus server, which isn't nice at all). It would be way easier if the server put out a full mix into a file, so we save on CPU load and don't need to introduce another round trip of Opus decoding and encoding.

pljones commented 4 years ago

Each client has their own "full mix". You'd effectively be running an extra client on the server and capturing its output in a file. Why not do that?

dingodoppelt commented 4 years ago

Each client has their own "full mix". You'd effectively be running an extra client on the server and capturing it's output in a file. Why not do that?

We are doing that right now and found it to be an inconvenient way of doing things, since the other software needed isn't failsafe either (we're having issues with levels and CPU usage because the affordable servers in the cloud don't pack that much of a punch to run all that simultaneously, and I'd rather have more slots on my Jamulus server than compromise for performance). Furthermore it involves installing dependencies, which makes the whole point of the "nosound" configuration useless since you have to install them anyway. This feature would complement the jamrecorder nicely, since it covers the other use case of a server-side output (streaming in almost real time). I think Jamulus would benefit from such a feature because streaming Jamulus live jams to an audience has proven to be a thing now :) Oh, and we save another round trip of decoding and encoding in the streaming client.

pljones commented 4 years ago

The best option seems to me to be a separate tool to take the recorder files as they're written and allow them to be mixed and streamed.

dingodoppelt commented 4 years ago

The best option seems to me to be a separate tool to take the recorder files as they're written and allow them to be mixed and streamed.

But this would still need extra software to duplicate functionality from the server. The "full mix" is already there and the ability to spit out WAV files is already there, so I thought the most elegant way to manage this would be to use the server, since it is running anyway and mixing and recording happily, just not into a stereo file.

pljones commented 4 years ago

Look, if the functionality is there, go write the code.

It's not.

corrados commented 4 years ago

I just did a "quick hack" test and moved the emit AudioFrame to another place: server.zip. I just tested it with one client and, at least for that simple test, I see the expected behaviour in the wave file (see the attached screenshot): at the red marks I changed the pan and at the black marks I changed the volume in my client's mixer panel.

dingodoppelt commented 4 years ago

I just did a "quick hack" test and moved the emit AudioFrame to another place:

Good one, Volker! I just tested with two clients and it works as expected. That's what I meant by the server "mixing and recording happily" ;)

pljones commented 4 years ago

OK, so that completely breaks the existing functionality.

What's needed is a new signal and a new slot in a new recorder (the existing naming conventions would break, and the existing RPP builder would then likely also break). And the signal needs to be raised once, not once per connected client as the existing signal is. And the AudioFrame needs to contain...? Um, something relevant to...?
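For illustration only, a minimal sketch of that shape -- one new signal raised once with the full mix, and one new slot in a separate writer object. The class, signal and slot names here are invented, not the actual Jamulus ones:

```cpp
#include <QCoreApplication>
#include <QObject>
#include <QVector>
#include <cstdint>

class CMixSource : public QObject
{
    Q_OBJECT
public:
    // Called once per server "tick" by whatever produces the full mix.
    void PushFrame ( const QVector<int16_t>& vecsMix ) { emit FullMixFrame ( vecsMix ); }
signals:
    // One emission per tick, independent of the number of connected clients.
    void FullMixFrame ( const QVector<int16_t>& vecsMix );
};

class CFullMixWriter : public QObject
{
    Q_OBJECT
public slots:
    void OnFullMixFrame ( const QVector<int16_t>& vecsMix )
    {
        // A real implementation would append vecsMix to a WAV or raw PCM file.
        Q_UNUSED ( vecsMix );
    }
};

int main ( int argc, char** argv )
{
    QCoreApplication app ( argc, argv );
    CMixSource     source;
    CFullMixWriter writer;

    QObject::connect ( &source, &CMixSource::FullMixFrame,
                       &writer, &CFullMixWriter::OnFullMixFrame );

    // One silent stereo block of 128 samples, just to exercise the path.
    source.PushFrame ( QVector<int16_t> ( 128 * 2, 0 ) );
    return 0;
}

#include "main.moc" // assuming this file is main.cpp built with qmake/AUTOMOC
```

The writer could then be moved to its own thread with a queued connection, mirroring how the existing recorder already runs separately from the server loop.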

corrados commented 4 years ago

I did not say that my code is production code. It was just a "quick hack" :-).

At least my code change is trivial and could be activated by a simple additional command-line argument. What I think it does is that, instead of the individual input signal being recorded, the individual mix which is sent back to that client is recorded. If dingodoppelt is happy with that, we do not need to change much. But I do not understand all the details of the recorder, so pljones, you are the expert and have to decide how to proceed here.

pljones commented 4 years ago

I did not say that my code is production code. It was just a "quick hack" :-).

Sorry, I didn't mean to direct this at you -- it was more towards the reaction.

I may also have overreacted -- my code might actually handle the change almost sensibly, depending on where the emit is now called from. The one thing that would be wrong would be the filename, which currently identifies the channel the frame came from, including the client details (IP, port, musician name). It could, I think, potentially cause a crash, depending on how it's been done (i.e. channel 0 is now unoccupied but an AudioFrame arrives).

Again, this is without seeing the code change. I'm just urging caution before anyone jumps to the conclusion that "it's done".

Also, as I say, I don't think you could get the existing code to work "both ways". It would need a separate signal and slot. Some of the existing code could be reused, yes (it's a WAV file writer...). Indeed, the RPP and LOF wouldn't be needed, so it needn't even go into the main JamRecorder.

dingodoppelt commented 3 years ago

Hi everybody, I took another shot at making the server stream audio directly to ffmpeg. It is more a proof of concept, since I still don't know what I'm doing, but it does what it is supposed to do and doesn't break existing code (I hope).

We currently stream the old-fashioned way, which involves running a JACK server, a Jamulus client and Icecast2 as a streaming service all on the same machine, which is not optimal for cheap virtual servers. We have also found that listeners join servers and block slots for people who actually want to play. We kindly ask those listeners to listen on the stream, and we have regular listeners (sometimes almost as many as there are slots on the server) on the streams, so this feature is really needed by now, since more people have started using and getting to know Jamulus. I could even imagine the JamulusExplorer featuring a "play button" so people can listen in to servers that support streaming, but that's just an idea and maybe a good opportunity to make people aware of the possibility to play online.

What still needs to be implemented is a clear indication that the server's sound is being transmitted. I think the recorder uses a protocol message for the client to display the "recording active" message for that purpose.

If you want to have a look at what I did, here you go: https://github.com/dingodoppelt/jamulus/tree/streamer2

cheers, nils

danryu commented 3 years ago

Thanks for the update @dingodoppelt.

Do your changes build on Volker's earlier hacks from Sep 2020? What do they add, in particular? I notice you do an external invocation of ffmpeg. It would be great if you could give an overview of how your changes operate, i.e. command-line changes, an example execution scenario and so forth.

dingodoppelt commented 3 years ago

@danryu: After Volker's hack I decided to create a new signal. The mix is created in its own function (basically a virtual client that just adds up all channels without applying individual gain settings, to save on processing time, as Jamulus already does this). ffmpeg is called in its own thread and receives server started and stopped signals (a signal handler would be another thing to think about, to stop and start streaming on the fly). You can start the server with: --streamto "-f mp3 icecast://source:<icecast2_source_pw>@<icecast2_ip>/". You can stream to every output format that ffmpeg offers (file, transcoding and basically everything else); for reference, consult the ffmpeg manual on output options.
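To make the idea concrete, here is a rough, self-contained sketch of such a "flat mix" being piped to ffmpeg -- this is not the code from the streamer2 branch, and the block size, client count and ffmpeg output arguments are placeholders:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

int main()
{
    const int samplesPerTick = 128; // assumed samples per server tick
    const int numClients     = 3;

    // One decoded interleaved stereo block per connected client (silent here).
    std::vector<std::vector<int32_t>> clientBlocks (
        numClients, std::vector<int32_t> ( samplesPerTick * 2, 0 ) );

    // ffmpeg reads signed 16-bit stereo PCM from stdin; an icecast URL could
    // replace out.mp3. popen is POSIX-only, which is the portability issue
    // raised later in the thread.
    FILE* ff = popen ( "ffmpeg -loglevel quiet -f s16le -ar 48000 -ac 2 "
                       "-i pipe:0 -f mp3 out.mp3", "w" );
    if ( ff == nullptr )
        return 1;

    // "Flat" mix: plain sum of all clients, clipped to the int16 range.
    std::vector<int16_t> mix ( samplesPerTick * 2 );
    for ( size_t i = 0; i < mix.size(); ++i )
    {
        int32_t acc = 0;
        for ( const auto& block : clientBlocks )
            acc += block[i];
        mix[i] = static_cast<int16_t> ( std::clamp<int32_t> ( acc, -32768, 32767 ) );
    }

    // A real server would write one such block per tick for as long as it runs.
    std::fwrite ( mix.data(), sizeof ( int16_t ), mix.size(), ff );

    pclose ( ff ); // ffmpeg sees EOF on stdin and finalizes the output
    return 0;
}
```

Summing without per-client gains keeps the CPU cost minimal, which is also why such a flat mix is sensitive to clipping, as noted below.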

danryu commented 3 years ago

@dingodoppelt That sounds good. So ffmpeg runs in its own thread but is still manageable via the signals, which is handy. About your notes on the mix - so is the resultant mix a "flat" automix of all channels - or otherwise who/which client determines the mix to be streamed? Sorry, I'm not familiar enough with the codebase and changes yet. Otherwise it looks great and I'll start experimenting with this soon....

dingodoppelt commented 3 years ago

@danryu It is, as you say, a "flat" automix, since the code should be running unattended. One thing I noticed is that it is really sensitive to clipping (level discipline still isn't everybody's virtue ;). I could have changed that already, but I wanted the code to be as lean on CPU usage as possible. To be able to mix a stream I still recommend "the old-fashioned way" of connecting a designated streaming client (preferably with a built-in compressor, as coded by hselasky a few months back), though this functionality might find its way into my code if it is really necessary.

danryu commented 3 years ago

Ok thanks for clarifying @dingodoppelt. Yes, as discussed earlier in the thread, finding a way to use a specific mix and avoid the re-encoding and decoding process of connecting another client - that would be perfect. At least we have a proof-of-concept solution for automix streaming now. Being able to set levels on that mix (also via signals perhaps? :) would be the icing on the cake :)

pljones commented 3 years ago

Hi @dingodoppelt / nils, thanks for taking this on!

Is there any chance you can rebase on current master and raise this as a pull request? It helps to see what the change would be.

Does this allow both recording and transmission at the same time?

dingodoppelt commented 3 years ago

@pljones Hi Peter, I just raised the PR.

Does this allow both recording and transmission at the same time?

It does! The jamrecorder I left untouched and it should work as before. ffmpeg does offer sending the input stream to multiple destinations, though I haven't tried it yet.

pljones commented 3 years ago

It failed to build, unfortunately. I'm hoping you're looking into it :).

pljones commented 3 years ago

It does! The jamrecorder I left untouched and it should work as before. ffmpeg does offer sending the input stream to multiple destinations, though I haven't tried it yet.

I mean the existing server recorder. Does that still work alongside this feature? It won't be accepted unless it does. You'd need to test out having the server streaming and doing server-side recording at the same time.

dingodoppelt commented 3 years ago

@pljones Yes, as I said, I didn't touch the jamrecorder at all and kept my code completely separate. It streams while the recorder is running. I could send signals to the jamrecorder and stop and start the recordings with signals.

After the rebase my code wasn't compiling anymore. It does now :) Still, it needs ffmpeg as a new dependency.

EDIT: Since I'm using Unix system calls (popen, pclose) it won't compile on Windows. I have no idea how to fix this. Is it even possible? (I'm not a programmer.)

EDIT2: I can't see why my PR includes some of the language files. I certainly have not touched them. Did I do something wrong on the rebase?

pljones commented 3 years ago

If the code is to be accepted and only compiles on Unix-like OSes, it will need to have the feature conditionally compiled. There are quite a lot of places that already check whether or not something is WIN32.

Does it introduce ffmpeg as a build time dependency? If so, then this feature would need to be made conditional (similar to the nosound or headless configs).
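A rough illustration of that conditional-compile pattern (the function name and body are invented for this sketch, not actual Jamulus code):

```cpp
#include <cstdio>

// Only non-Windows builds get the popen-based streaming path; on Windows the
// feature is simply compiled out, like the existing WIN32 checks elsewhere.
bool StartStreamProcess ( const char* ffmpegCommand )
{
#ifndef _WIN32
    FILE* pipe = popen ( ffmpegCommand, "w" );
    if ( pipe == nullptr )
        return false;
    // ... write PCM blocks to pipe here ...
    pclose ( pipe );
    return true;
#else
    ( void ) ffmpegCommand; // feature unavailable on Windows builds
    return false;
#endif
}

int main()
{
    return StartStreamProcess ( "ffmpeg -version" ) ? 0 : 1;
}
```

An alternative that would also build on Windows is Qt's QProcess, which would avoid popen entirely, but that is beyond this sketch.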

And yes, I'd guess the rebase went horribly wrong. Rule of rebase: always check that only the files you expect to change have changed and that they only contain the expected changes... Undoing is hard. You may still have the last "working" commit locally, though, if you have a dig around, and you'll be able to check that out. So long as you can get the patch against its history back to master into a file, you may be able to then clean up locally.

hoffie commented 3 years ago

i can't see why my PR includes some of the language files. I certainly have not touched them

Haven't been able to reproduce at will, but I've also had changed language files after building. If someone can reproduce, this might help in tracking this down.

Does it introduce ffmpeg as a build time dependency?

As far as I can see, no libav* is used directly, so it should be a runtime dependency. ffmpeg is invoked as a binary.

You may still have the last "working" commit locally, though, if you have a dig around, and you'll be able to check that out.

git reflog can help there.

dingodoppelt commented 3 years ago

Thanks for the advice! Another problem occurred though: when launching the server it doesn't recognize the command-line argument and breaks with... EDIT: stupid me... ;)

pljones commented 3 years ago

Just referencing #967 -- that is this, right?

gilgongo commented 3 years ago

I think we can close this now on the grounds that you can mix down the files supplied if you want the mix; else use the new streamer if you want a stream.