ryanheise / just_audio

Audio Player

Documentation for playing from a byte stream with unknown length. #1135

Open PeperMarkreel opened 9 months ago

PeperMarkreel commented 9 months ago

To which pages does your suggestion apply?

- Direct URL 1
- Direct URL 2

Quote the sentence(s) from the documentation to be improved (if any)

@override
Future<StreamAudioResponse> request([int? start, int? end]) async {
  start ??= 0;
  end ??= bytes.length;
  return StreamAudioResponse(
    sourceLength: bytes.length,
    contentLength: end - start,
    offset: start,
    stream: Stream.value(bytes.sublist(start, end)),
    contentType: 'audio/mpeg',
  );
}

Describe your suggestion

The stream in the example is of known length. When the stream has not yet ended, there is no way of knowing the length of the stream. That's a pity because you'll need to wait for the stream to end and introduce an unnecessary delay to the end user.

I have tried a lot of different approaches to play from a direct stream, and for Android, it's not a big problem, but the iOS client is very picky and I could not get it to work even after days and days of trial and error.

So could you please provide an example that works for both Android and iOS, where you can stream from a source of unknown length and the audio player keeps playing the rest of the stream through repeated calls to request after the first bytes have been streamed?

Thank you.

Also, in a lot of cases, the first request is done with start 0 end 2 (2 bytes?) and all later requests are done with start null and end null. Is that the intended behavior?

ryanheise commented 9 months ago

Also, in a lot of cases, the first request is done with start 0 end 2 (2 bytes?) and all later requests are done with start null and end null. Is that the intended behavior?

First, it's important to understand that by using this API, you are effectively building your own web server whose sole purpose is to take HTTP requests for audio resources and respond with the correct HTTP response. In this client-server scenario, your code is the server, and you have no control over the client. If it sends you a request starting at 0 and ending at 2, then so be it - that is what the client wants to do. The intention is not my own, but that of whoever is making the requests (in your case, the iOS audio player).

What I would therefore suggest is that you read online tutorials on how to write a server that streams an audio file of unknown length and what the headers need to look like, and especially check the iOS documentation, because I vaguely remember iOS saying it will only play media from servers that meet certain requirements. If your use case is not supported by the just_audio API, please submit a feature request telling me what additional changes you require in order to support your use case.
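As a rough illustration of that request/response mapping (this is generic HTTP semantics, not part of just_audio's API, and responseHeadersFor is just a made-up helper name for the sketch):

// Hypothetical helper: maps StreamAudioResponse-style fields onto the HTTP
// headers a streaming server would send back. sourceLength == null
// corresponds to an unknown total size, written as "*" in Content-Range.
Map<String, String> responseHeadersFor({
  int? sourceLength,
  required int contentLength,
  required int offset,
  required String contentType,
}) {
  final total = sourceLength?.toString() ?? '*';
  return {
    'Content-Type': contentType,
    'Content-Length': '$contentLength',
    'Content-Range': 'bytes $offset-${offset + contentLength - 1}/$total',
  };
}

// For example, the player probing with request(0, 2) is like a client sending
// "Range: bytes=0-1" and expecting a 206 response describing those two bytes.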

Riyan-M commented 9 months ago

Hello @ryanheise,

I've been closely following an ongoing issue in your repository and wanted to seek your expertise on a particular use case that seems challenging. The scenario involves:

A web server dynamically generating audio bytes (similar to OpenAI's Text-to-Speech service) without a predefined byte length; because the bytes are generated on the fly, it's impossible to determine what the final length will be.

On the client side, we've tried to implement a custom StreamAudioSource capable of handling a continuous stream of audio. The primary challenge is the uncertainty regarding the total byte length of the streamed audio from the server. In our attempts to address this, we set an arbitrarily large byte length, which somewhat worked but introduced noticeable delays and inconsistencies in the player state. This issue likely stems from the system anticipating additional bytes.

Given the growing relevance of Text-to-Speech technology at the moment, I believe this use case will become increasingly common. Your package has been immensely valuable, and I would greatly appreciate your thoughts on whether this scenario is feasible within the current framework of the package.

Any guidance or suggestions for handling such dynamic audio streams would be highly beneficial. Thank you for your time and for the exceptional work on this package.

Kind regards.

ryanheise commented 9 months ago

Unfortunately there is no advice I can personally give on how to (effectively) implement an HTTP server that streams audio. What I said in my previous comment was that I would suggest doing some research on that specific task, and if after looking into this, you discover that the StreamAudioSource API lacks some critical feature that you require in order to (effectively) implement your HTTP server, you can submit that feature request. For example, it may turn out that you require some way to handle request/response headers.

So suppose that you first try to build your own HTTP server literally on the server side that streams audio and you learn enough about HTTP and streaming audio to get that working. Once you can do that, you will find that you can either easily do the same type of implementation as a subclass of StreamAudioSource or you'll find some critical feature missing in which case you can submit a feature request.
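To be clear about what I mean by that first step, the bare shape of such a server looks something like the sketch below (generic dart:io, untested, with a made-up mp3Chunks() standing in for wherever your encoded audio comes from; this is not just_audio code):

import 'dart:io';

// Hypothetical: yields encoded MP3 bytes as they are produced
// (e.g. piped from a TTS service).
Stream<List<int>> mp3Chunks() async* {}

Future<void> main() async {
  final server = await HttpServer.bind(InternetAddress.loopbackIPv4, 8080);
  await for (final HttpRequest request in server) {
    final response = request.response;
    response.headers.contentType = ContentType('audio', 'mpeg');
    // Unknown total length: leave Content-Length unset so dart:io falls back
    // to chunked transfer encoding, and advertise that ranges are unsupported.
    response.headers.set(HttpHeaders.acceptRangesHeader, 'none');
    await response.addStream(mp3Chunks());
    await response.close();
  }
}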

Riyan-M commented 9 months ago

Hello Ryan,

Thank you for your prompt response. We've successfully implemented a web server that streams audio bytes and confirmed its functionality through a JavaScript-based client on the web, which verifies that our server-side solution works as intended. To clarify our position: we can indeed stream bytes and the player works reasonably well, but it has buggy behaviour.

However, we're encountering difficulties in replicating this functionality in Flutter. Despite implementing the suggestions you've previously made on similar topics (including those on Stack Overflow), we're still facing issues.

Our main question at this point is: can we play audio bytes in Flutter if the byte length is unknown? This is under the assumption that our server is correctly implemented, which we've verified by successfully interfacing with a different front end. We've sunk a lot of time into trying to get just_audio to work with this, but it still has issues where it cuts off at the beginning and reports unexpected states.

I am willing to provide code snippets and logs for a more detailed look into the errors we're encountering. Your input would be highly valued, especially considering the growing importance of speech-to-speech technologies. Let me know if you see this as a priority for your package.

Looking forward to your guidance.

Best regards.

ryanheise commented 9 months ago

I'm a little confused by your question, since you said you successfully implemented a web server that streams audio bytes, but are those bytes raw PCM audio data, or are they encoded data? iOS and Android don't just play raw PCM data; they can only be fed encoded audio data (e.g. MP3). So you would need to find an audio encoding that is supported on both iOS and Android and that supports streaming, and then your server should simply output that stream in whatever way iOS and Android support. If you have already done that, then just_audio should be able to play it, because just_audio uses iOS and Android's native players to play that audio stream.

So when you say you've already built the server, and it works, is it producing valid audio data that is supported on Android and iOS? If not, that's what you would first need to do, because once you've done that, it will work in just_audio.

Riyan-M commented 9 months ago

Sorry for the confusion - I'll try to break it down for you a bit and provide code for more context.

To address your initial concerns first: we are using MP3-encoded audio bytes, and they are indeed supported by iOS and Android, as we are getting playback from just_audio.

Code to show our implementation:

class MyCustomSource extends StreamAudioSource {
  final Stream<List<int>>? bytes;
  final int contentLength; // Arbitrarily large, since we do not know the final byte length
  final List<int>? bytesShort;

  MyCustomSource({this.bytes, this.contentLength = 10000000, this.bytesShort});

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    // start and end are ignored, since we always stream from the beginning.
    return StreamAudioResponse(
      rangeRequestsSupported: false,
      // sourceLength: bytesShort?.length, // Might need adjusting based on the player's behaviour
      sourceLength: null,
      contentLength: contentLength,
      // contentLength: null,
      offset: 0,
      stream: bytesShort != null
          ? Stream.value(bytesShort!.sublist(0, bytesShort!.length))
          : bytes!,
      contentType: 'audio/mpeg',
    );
  }
}

NOTE: We are supplying a stream which emits bytes into this MyCustomSource instance. For short audio we fall back to a byte array of silent audio, which I explain further below.
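For context, this is roughly how we wire the socket bytes into that source and start playback (a simplified sketch; playFromSocket and the URI are placeholders, and we use the web_socket_channel package):

import 'package:just_audio/just_audio.dart';
import 'package:web_socket_channel/web_socket_channel.dart';

Future<void> playFromSocket(Uri ttsSocketUri) async {
  final channel = WebSocketChannel.connect(ttsSocketUri);
  // Forward the raw MP3 chunks arriving over the socket into the source,
  // assuming the server sends binary frames.
  final byteStream =
      channel.stream.map<List<int>>((message) => message as List<int>);
  final player = AudioPlayer();
  await player.setAudioSource(MyCustomSource(bytes: byteStream));
  await player.play();
}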

Issues we have come across:

  1. Audio cuts off at the beginning (after ~800ms the rest plays fine). Our workaround, as you can see, is to play "silent" bytes to mitigate this, but it is a band-aid solution.
  2. If we supply audio that's too short to the MyCustomSource instance, it won't play at all.
  3. We do not know how many bytes our server will ultimately produce, so we set contentLength to an arbitrarily high number. The issue this causes is that the player keeps expecting more bytes, which leads to the player state jumping between idle and buffering and makes using states as part of the workflow difficult.
  4. Buffering in general seems quite slow, which may be because the required initial buffer is larger than needed (given all these questions, this may be out of scope for our conversation).

I'm always concerned about taking too much of the author's time, as this is an open-source project and it's not your job to educate users on elements external to this package. However, we're quite certain we've implemented everything outside of just_audio correctly and aren't quite sure why the package isn't able to handle this properly.

ryanheise commented 9 months ago

We do not know how many bytes our server will ultimately produce, so we set contentLength to an arbitrarily high number.

You shared your StreamAudioSource implementation above, but in your actual server implementation (before we even convert that into a StreamAudioSource implementation), are you also doing that same hack? Because that's something I would assume you'd want to figure out how to do properly first, before bringing your solution into just_audio's framework.

Audio cuts off at the beginning

On both iOS and Android?

Riyan-M commented 9 months ago

You shared your StreamAudioSource implementation above, but in your actual server implementation (before we even convert that into a StreamAudioSource implementation), are you also doing that same hack? Because that's something I would assume you'd want to figure out how to do properly first, before bringing your solution into just_audio's framework.

The server establishes a websocket connection with the client and just sends through the audio bytes. We have to implement this hack because if the overall byte length is too short, the player will be stuck in the buffering state. The silent-bytes hack isn't done on the server side; it's done in Flutter: when we detect that the bytes are too short, we play an array of silent bytes that we ship with the app. Importantly, what we'd want to figure out is: is just_audio unable to play the bytes if they're too short? If that's the case, that may be something contributors may want to look at, if streamed audio is a priority for you.

On both iOS and Android?

We've only been experiencing this on iOS; our tests on Android (we test far more on iOS, mind you) show that it works fine.

DanielEdrisian commented 9 months ago

I want to add my support for @Riyan-M, as well as the dozens of other people who have requested this feature.

ryanheise commented 9 months ago

and you simply decide to not listen.

My comment will also be hidden, but since @DanielEdrisian's comment was later edited to remove the quote above, I am quoting it here to leave a trail of why @DanielEdrisian's original comment was hidden.

Riyan-M commented 9 months ago

(I do not condone the above message, please don't have me caught in the crossfire 🙉)

ryanheise commented 9 months ago

@Riyan-M I'm sorry if I was unclear, but to explain again, what I'm suggesting you do is to learn independently how to implement an HTTP server (not websocket) that does audio streaming, and also in a way that is supported natively by the iOS and Android platforms according to the respective platform's documentation. just_audio wraps around the native platform players for Android and iOS, and so it supports whatever audio encodings those platforms themselves support, and it supports streaming in whatever way the "underlying" platforms permit it, and I have no control over that.

You will need to do some research on your own about the server side techniques for streaming audio "over HTTP" in standard ways that Android and iOS also directly support. There are going to be standard server side techniques out there for streaming audio, but I can't teach them. What just_audio does is "consumes" streams of audio and feeds them to the underlying platform. If the underlying platform recognises those streams, it will play them. It sounds like you still need to do that research.

After having done that research, if you find that my API requires some additional features, such as the ability to set headers, you are welcome to make a feature request for the specific features that you have determined are required for your use case. That would be the most helpful way of contributing to the project.

DanielEdrisian commented 9 months ago

The problem is that you say that StreamAudioResponse can technically take in just an array of bytes with unknown length, regardless of where those bytes come from. It shouldn't matter whether they come from a websocket or an HTTP request. You could simply say: "Guys, sourceLength has to have a value, and it will only listen to that many values before it cuts off audio." Maybe this is actually a bug in your implementation. But countless times people ask you this question (on here and on Stack Overflow) and you just respond with "learn HTTP streaming", instead of acknowledging that either the messaging about your capabilities is unclear, or that there is serious demand for StreamAudioResponse to take unknown-sized audio content.

ryanheise commented 9 months ago

@DanielEdrisian everybody else is trying to contribute in a constructive way. If you have identified a bug, the way you can contribute constructively is to submit a bug report in accordance with the contributing guidelines linked at the bottom of this page.

artyomkonyaev commented 9 months ago

Hello @Riyan-M,

I am currently facing the same exact issue that you've been discussing here. Have you been able to solve this problem?

I would be grateful if you could share more details about your current approach, possibly with code snippets, even if you are using something other than just_audio (perhaps such knowledge could later help improve this package's capabilities).

Thank you.

mahdi-rafiei commented 8 months ago

Hello @Riyan-M and @artyomkonyaev,

I'm encountering a similar issue where the player doesn't request the next data chunk after finishing the first one when the source length is unknown.

Have you found any solution for that?

I've meticulously reviewed all the comments in this thread and I'm perplexed. Either I'm missing something obvious, or perhaps @ryanheise might have overlooked some details in @Riyan-M's messages as @DanielEdrisian also mentioned.

In my specific situation, I'm working with an encrypted audio file, so I'm aware of its length, which helps me to some extent. However, this might cause huge issues for those dealing with sources of unknown length.

I experimented with a potential solution by setting the sourceLength to null as long as the stream remains open. Once the stream closes, I then send the actual source length with the new range request.

Interestingly, this approach led to erratic behavior in the position stream. When the audio is playing, it consistently shows a position of 0. If I pause the playback, it then reflects the correct position. Resuming playback reverts it back to showing 0, creating a continuous cycle of inaccurate position reporting...
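In case it helps anyone reproduce this, the shape of my experiment was roughly the following (a simplified reconstruction; GrowingSource, add and finish are just illustrative names, not part of just_audio):

import 'package:just_audio/just_audio.dart';

class GrowingSource extends StreamAudioSource {
  final List<int> _received = []; // bytes collected so far
  bool _done = false;             // set when the upstream stream closes

  void add(List<int> chunk) => _received.addAll(chunk);
  void finish() => _done = true;

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    start ??= 0;
    end ??= _received.length;
    return StreamAudioResponse(
      // Report the real length only once the stream has closed; before that,
      // null signals "total size unknown".
      sourceLength: _done ? _received.length : null,
      contentLength: end - start,
      offset: start,
      stream: Stream.value(_received.sublist(start, end)),
      contentType: 'audio/mpeg',
    );
  }
}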

ryanheise commented 8 months ago

I've meticulously reviewed all the comments in this thread and I'm perplexed. Either I'm missing something obvious, or perhaps @ryanheise might have overlooked some details in @Riyan-M's messages as @DanielEdrisian also mentioned.

I have never taught anyone how to build their apps, I have only provided the API that wraps HTTP, and I have only ever suggested that you learn elsewhere whatever you need to know to build an HTTP server. Once you find out how to build your particular HTTP server, then you can take a look at the StreamAudioSource API and see if it has all of the HTTP mappings that you require in order to implement your solution. If it does not, you can submit a very specific feature request for the missing headers/etc that you need me to implement.

As an experimental API, there is always the possibility that you are trying to do something that depends on features of HTTP that are not included in the API. If that is the case, you will need to become informed about what those features are and then request them.

PeperMarkreel commented 7 months ago

Does anyone here have a working solution they're willing to share?

I am currently working on my second project where I'd like to use this functionality, and I remember the days of trial and error from my previous attempt to implement this. It's specifically streaming from OpenAI TTS. I'll probably do a deep dive into the quirks of the iOS player's HTTP streaming requests if no one has something.

decisionslab2 commented 6 months ago

I'm encountering the same issue on iOS devices. Is there any fix?

nialljawad96 commented 5 months ago

@Riyan-M Hi Riyan, did you happen to solve this at all?