discord-jda / JDA

Java wrapper for the popular chat & VOIP service: Discord https://discord.com
Apache License 2.0

Provide access to decrypted AudioPackets, regardless of sequence number #418

Closed mentlerd closed 5 years ago

mentlerd commented 7 years ago

For applications that wish to do per-user voice recording, an API like this would be really useful.

The responsibility of having to reassemble the audio stream from out-of-order packets should be placed on the consumers of the interface.

SparklingComet commented 7 years ago

It would be really useful, but recording people in a (group) voice chat without their knowledge is illegal in many countries. It should be restricted to the currently logged-in user's voice, if possible. (Not sure whether it is.)

Almighty-Alpaca commented 7 years ago

People can and will always abuse features. Restricting recording to yourself (which is implementation wise something entirely different from recording others btw) just because someone might use it on other people without their knowledge is really bad imo.

SparklingComet commented 7 years ago

> Restricting recording to yourself (which is implementation wise something entirely different from recording others btw)

I thought so

> Restricting recording to yourself just because someone might use it on other people without their knowledge is really bad imo.

True, but we are allowing them to do so by adding this. If I am speaking with someone and I want to record only myself, I can use a mic and recording software. A bot enables abuse and nothing else.

MinnDevelopment commented 7 years ago

JDA already supports receiving the audio of users when they speak. I don't understand how this discussion is relevant to the thread.

SparklingComet commented 7 years ago

> JDA already supports receiving the audio of users when they speak. I don't understand how this discussion is relevant to the thread.

It's relevant to the suggestion - whether or not it makes sense.

MinnDevelopment commented 7 years ago

The suggestion has nothing to do with the fact that you can already implement such functionality with the current state of JDA. The request is to give users access to unhandled frames. You are talking about having voice receive support in general, which is already a thing, so discussing whether it should be restricted or not is nonsense.

SparklingComet commented 7 years ago

> The suggestion has nothing to do with the fact that you can already implement such functionality with the current state of JDA. The request is to give users access to unhandled frames. You are talking about having voice receive support in general, which is already a thing, so discussing whether it should be restricted or not is nonsense.

If it's already a thing, the whole issue can be closed. It's not my reply, but the issue itself that's redundant.

MinnDevelopment commented 7 years ago

No, you don't even understand what is requested. Please stop arguing about something that is completely irrelevant to this thread.

DV8FromTheWorld commented 7 years ago

@mentlerd I don't see the use case.

Additionally, it really isn't the client's responsibility to decode and ensure packet ordering. That is the entire point of a library. It abstracts that system so that the client doesn't have to care about it. Also, to properly decode audio packets, we have to keep a semblance of order, otherwise Opus's decoding algorithms get mad and can produce jarring audio artifacts.

If you want user-specific audio recording (i.e., not the combined audio of all users), JDA already provides that.

mentlerd commented 7 years ago

My use case would be a "rolling window" voice recorder bot that captures the last X minutes of conversation, with separate channels for each user.

Since this application is not realtime at all, I could just save the packets to disk in the order I receive them, then later reassemble each user's complete stream (accounting for packet order, and for Discord not sending audio packets during continuous silence), decode the Opus data, and encode the audio to a different format.

This approach requires other metadata to be saved to the disk, but as you have mentioned before, this application is out of the scope of the library.

I am only asking for an API to access raw AudioPackets, and possibly disable OPUS decoding to save on performance.
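
As an illustration of the offline reassembly step described in this comment, here is a minimal sketch. It assumes each saved packet carries the 16-bit RTP sequence number and 32-bit RTP timestamp from its header, and that frames are 20 ms at 48 kHz (960 samples each). `StoredPacket`, `reorder`, and `silentFramesBetween` are hypothetical names for illustration, not part of JDA.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class PacketReassembly {
    /** Hypothetical on-disk record of one received packet; not a JDA type. */
    static final class StoredPacket {
        final int sequence;  // 16-bit RTP sequence number, 0..65535
        final int timestamp; // 32-bit RTP timestamp, counted in samples
        final byte[] opus;   // still-encoded Opus payload

        StoredPacket(int sequence, int timestamp, byte[] opus) {
            this.sequence = sequence;
            this.timestamp = timestamp;
            this.opus = opus;
        }
    }

    /** Assumption: 20 ms frames at 48 kHz, so timestamps advance by 960 per frame. */
    static final int SAMPLES_PER_FRAME = 960;

    /** Returns one user's packets in playback order, dropping duplicates. */
    static List<StoredPacket> reorder(List<StoredPacket> arrivalOrder) {
        TreeMap<Long, StoredPacket> ordered = new TreeMap<>();
        long lastExtended = 0;
        boolean first = true;
        for (StoredPacket p : arrivalOrder) {
            long extended;
            if (first) {
                extended = p.sequence;
                first = false;
            } else {
                // Casting to short gives the signed 16-bit distance from the
                // previous packet, which absorbs sequence-number wraparound.
                int lastLow = (int) (lastExtended & 0xFFFF);
                extended = lastExtended + (short) (p.sequence - lastLow);
            }
            ordered.putIfAbsent(extended, p); // keep the first copy of a duplicate
            lastExtended = extended;
        }
        return new ArrayList<>(ordered.values());
    }

    /** Number of frames missing between two consecutive packets (silence or loss). */
    static int silentFramesBetween(StoredPacket prev, StoredPacket next) {
        int delta = next.timestamp - prev.timestamp; // int math absorbs 32-bit wraparound
        return Math.max(0, delta / SAMPLES_PER_FRAME - 1);
    }
}
```

The unwrapping only works if each packet arrives within 32768 sequence numbers of the previous one, which should hold for ordinary network reordering. The gaps reported by `silentFramesBetween` could be filled with silence before decoding.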

Sanduhr32 commented 7 years ago

Discord sends the AudioPackets in the Opus format, so JDA won't or can't change that (I don't know, I'm not staff), and you can access the audio via AudioReceiveHandler using the methods handleUserAudio or handleCombinedAudio. Here is an example:

// Be sure to set the AudioReceiveHandler on the Guild's AudioManager.
public void handleCombinedAudio(CombinedAudio combinedAudio) {
    // The argument is the volume, which must be a double.
    byte[] pcm = combinedAudio.getAudioData(1.0);
    // The method docs can be found here: http://home.dv8tion.net:8080/job/JDA/lastSuccessfulBuild/javadoc/net/dv8tion/jda/core/audio/AudioReceiveHandler.html#handleCombinedAudio-net.dv8tion.jda.core.audio.CombinedAudio-
    // I recommend a volume between 0.0 and 2.0 so the audio is neither too loud nor too quiet.
    // You can use handleUserAudio too.
    // If you want a recording bot for only a few users, you can switch on their user IDs.
    // To save the audio, I recommend a FileOutputStream writing to a .pcm file.
}

I hope that helps you. Regards Sanduhr

PS: I recommend Audacity for converting the PCM file into the format of your choice.
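
As an alternative to a manual conversion, the saved PCM can be wrapped in a WAV container programmatically. The sketch below is illustrative only: it assumes the bytes are 48 kHz, 16-bit, stereo, signed, big-endian PCM (which is my understanding of what JDA's receive handlers produce), and `PcmToWav`, `toWav`, and `convert` are hypothetical names. WAV stores samples little-endian, so each 16-bit sample is byte-swapped.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PcmToWav {
    // Assumed input format: 48 kHz, stereo, 16-bit signed, big-endian.
    static final int SAMPLE_RATE = 48_000;
    static final int CHANNELS = 2;
    static final int BITS = 16;

    /** Wraps big-endian 16-bit PCM in a canonical 44-byte WAV header. */
    static byte[] toWav(byte[] bigEndianPcm) {
        int dataLen = bigEndianPcm.length & ~1; // ignore a trailing odd byte
        int blockAlign = CHANNELS * BITS / 8;
        ByteBuffer out = ByteBuffer.allocate(44 + dataLen).order(ByteOrder.LITTLE_ENDIAN);
        out.put(ascii("RIFF")).putInt(36 + dataLen).put(ascii("WAVE"));
        out.put(ascii("fmt ")).putInt(16);       // size of the PCM fmt chunk
        out.putShort((short) 1);                 // audio format 1 = uncompressed PCM
        out.putShort((short) CHANNELS);
        out.putInt(SAMPLE_RATE);
        out.putInt(SAMPLE_RATE * blockAlign);    // bytes per second
        out.putShort((short) blockAlign);
        out.putShort((short) BITS);
        out.put(ascii("data")).putInt(dataLen);
        for (int i = 0; i < dataLen; i += 2) {   // swap each sample to little-endian
            out.put(bigEndianPcm[i + 1]);
            out.put(bigEndianPcm[i]);
        }
        return out.array();
    }

    /** Reads a raw .pcm file and writes the WAV equivalent. */
    static void convert(Path pcmFile, Path wavFile) throws IOException {
        Files.write(wavFile, toWav(Files.readAllBytes(pcmFile)));
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }
}
```

This reads the whole file into memory, which is fine for short recordings; a long recording would need a streaming variant.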

mentlerd commented 7 years ago

I appreciate your attempt to help, but I am very well aware of the current interface of the audio system. What I am requesting currently cannot be done without modifying the library's core files.

This is not a support/help ticket, and I have done my homework on the subject.

DV8FromTheWorld commented 7 years ago

I appreciate the PR and I will review it after audio-resume, however, based on your use case, I don't see why this can't be implemented with current functionality.

Could you possibly elaborate on why you can't capture the PCM output via UserAudio and save that to the disk?

DV8FromTheWorld commented 7 years ago

@MinnDevelopment has further elaborated on your use-case in a way that I missed.

You basically need what the AudioSendHandler has: a flag like isOpus, but one that prevents decoding instead of encoding.

I will review your PR, but most likely AudioReceiveHandler will have a method in which you can decide whether to use opus decoding or not.

DV8FromTheWorld commented 7 years ago

I do have plans for this, namely a boolean hook in AudioReceiveHandler to control decoding, similar to the isOpus hook in AudioSendHandler.
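
To make the idea of such a hook concrete, here is a rough, self-contained sketch. It is not JDA's API: `RawAwareReceiveHandler`, `shouldDecode`, `handleEncoded`, `handlePcm`, and `ReceiveDispatcher` are all hypothetical names, and the decoder is a stand-in function rather than a real Opus decoder.

```java
import java.util.function.Function;

/** Hypothetical receive handler; the names are illustrative, not JDA's API. */
interface RawAwareReceiveHandler {
    /** Return false to receive the Opus payload untouched and skip decoding. */
    default boolean shouldDecode() {
        return true;
    }

    /** Called with the still-encoded payload when shouldDecode() is false. */
    default void handleEncoded(long userId, byte[] opus) {
    }

    /** Called with decoded PCM when shouldDecode() is true. */
    default void handlePcm(long userId, byte[] pcm) {
    }
}

/** Hypothetical library-side dispatch that consults the hook for every packet. */
final class ReceiveDispatcher {
    private final Function<byte[], byte[]> decoder; // stand-in for an Opus decoder

    ReceiveDispatcher(Function<byte[], byte[]> decoder) {
        this.decoder = decoder;
    }

    void dispatch(RawAwareReceiveHandler handler, long userId, byte[] opus) {
        if (handler.shouldDecode()) {
            handler.handlePcm(userId, decoder.apply(opus));
        } else {
            // The decoder is never invoked, which saves the decoding cost.
            handler.handleEncoded(userId, opus);
        }
    }
}
```

Defaulting the hook to `true` would keep existing handlers working unchanged, while a recorder that only stores packets could override it to return `false`.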

mpotthoff commented 5 years ago

I'm actually in need of the same feature. It seems like @MinnDevelopment has already started working on it in feature/opus-interceptor half a year ago. For now, I'm using a rebased version of it and it seems to be working fine so far. Are there any plans to finish it?

MinnDevelopment commented 5 years ago

There are currently no plans to work on voice receive at all due to #904