PythonistaGuild / TwitchIO

An Async Bot/API wrapper for Twitch made in Python.
https://twitchio.dev
MIT License
798 stars 163 forks source link

player.play() doesn't correctly handle sound files with non-default format #358

Closed iarspider closed 2 weeks ago

iarspider commented 1 year ago

I'm trying to use the new Sounds ext, and I have noticed that playing mp3 files with only one channel produces weird result - I'd say the file is played at 2x speed, or the pitch is shifted up. Converting the file to stereo fixes the issue.

MediaInfo output for mono file:

General
Complete name                            : Minion General Speech@ignore@ForMe01.mp3
Format                                   : MPEG Audio
File size                                : 25.3 KiB
Duration                                 : 1 s 619 ms
Overall bit rate mode                    : Constant
Overall bit rate                         : 128 kb/s
Genre                                    : Other

Audio
Format                                   : MPEG Audio
Format version                           : Version 1
Format profile                           : Layer 3
Duration                                 : 1 s 620 ms
Bit rate mode                            : Constant
Bit rate                                 : 128 kb/s
Channel(s)                               : 1 channel
Sampling rate                            : 44.1 kHz
Frame rate                               : 38.281 FPS (1152 SPF)
Compression mode                         : Lossy
Stream size                              : 25.3 KiB (100%)

MediaInfo output for stereo file:

General
Complete name                            : Minion General Speech@ignore@ForMe01.mp3
Format                                   : MPEG Audio
File size                                : 26.0 KiB
Duration                                 : 1 s 645 ms
Overall bit rate mode                    : Constant
Overall bit rate                         : 128 kb/s
Genre                                    : Other
Writing library                          : LAME3.100

Audio
Format                                   : MPEG Audio
Format version                           : Version 1
Format profile                           : Layer 3
Format settings                          : Joint stereo / MS Stereo
Duration                                 : 1 s 646 ms
Bit rate mode                            : Constant
Bit rate                                 : 128 kb/s
Channel(s)                               : 2 channels
Sampling rate                            : 44.1 kHz
Frame rate                               : 38.281 FPS (1152 SPF)
Compression mode                         : Lossy
Stream size                              : 25.7 KiB (99%)
Writing library                          : LAME3.100
github-actions[bot] commented 1 year ago

Hello! Thanks for the issue. If this is a general help question, for a faster response consider joining the official Discord Server

Else if you have an issue with the library please wait for someone to help you here.

iarspider commented 1 year ago

I don't have any 5.1 or 7.1 files, but I would guess they will also be handled incorrectly, maybe the pitch will shift down?

iarspider commented 1 year ago

So, a small update: player.play() expects the sound file to be in one specific format: 44.1 kHz, 128 kb/s, 2 channels. If any of the parameters are off, the sound will not be played correctly. I think (but don't have time to test it) that adding

-ab 128 -ar 44100 -ac 2

before pipe:1 in ffmpeg invocation here and maybe here will help.

plomdawg commented 1 year ago

can confirm - my mono .mp3s also play super fast

image

until I changed the lines as suggested above:

            self.proc = subprocess.Popen(
                [
                    ffmpeg_bin,
                    "-i",
                    source,
                    "-loglevel",
                    "panic",
                    "-vn",
                    "-f",
                    "s16le",
                    "-ab",
                    "128",
                    "-ar",
                    "44100",
                    "-ac",
                    "2",
                    "pipe:1",
                ],
sockheadrps commented 1 year ago

This is where the sample rate is hard coded. It just appears like this just isnt finished, and local files havnt been fully covered yet. Idk if theres a better way, maybe try to read in sample rate from meta data on audio, but I made a PR to at least allow you to set the values of those things that way we dont have to change the library code anymore lol

enekochan commented 1 year ago

The actual rate right now for the audio files must be converted to 48000 and not 44100. I don't think this is mentioned in the docs anywhere and is not trivial find this solution.

Is there a problem to merge the PR by @sockheadrps?

IAmTomahawkx commented 1 year ago

Is there a problem to merge the PR by @sockheadrps?

I forgot it existed :) There's a couple meta issues with the pr, but once they've been fixed it can be merged

sockheadrps commented 4 months ago

I think this issue can be closed since #454 addressed the problem in this issue by detecting sample rate and channels from the audio meta data, and exposing both properties with setters just in case the meta data isn’t accurate.

iarspider commented 3 months ago

Thanks for looking into it. With this update however, I'm getting weird results - if I play two sounds in a row, the second one isn't played, and holds the file open for eternity. I will try to post a reproducer later.

iarspider commented 3 months ago

And here is the reproducer:

import asyncio
import os
import time

import eyed3

from pathlib import Path

from twitchio.ext import commands, sounds

class Bot(commands.Bot):
    def play_sound(self, sound: str):
        soundfile = str(Path(__file__).parent / sound)
        sound = sounds.Sound(soundfile)

        print("play sound", soundfile)

        self.player.play(sound)

        duration = eyed3.load(soundfile).info.time_secs
        time.sleep(duration)
        print(f"slept for {duration}s")

    def __init__(self, initial_channels=None):
        super().__init__(
            token=os.getenv("TWITCH_CHAT_PASSWORD"),
            client_id=os.getenv("TWITCH_CHAT_CLIENT_ID"),
            nick="arachnobot",
            prefix="!",
            initial_channels=["#iarspider"],
        )

        self.player = sounds.AudioPlayer(callback=self.player_done)

    async def event_ready(self):
        print(f"Ready | {self.nick}")
        self.play_sound("ding-sound-effect_1.mp3")
        self.play_sound("ding-sound-effect_1.mp3")

    async def player_done(self):
        print("Player done")
        pass

async def main():
    global client, twitch_bot
    twitch_bot = Bot()
    await twitch_bot.start()

if __name__ == "__main__":
    asyncio.run(main())

(only extra dependency is eyed3 to get sound file duration, can replace with fixed-duration). File used: ding-sound-effect_1.mp3

This outputs:

Ready | arachnobot
play sound e:\Temp\4\ding-sound-effect_1.mp3
slept for 2.95s
play sound e:\Temp\4\ding-sound-effect_1.mp3
slept for 2.95s
Player done

and plays ding only once.

iarspider commented 3 months ago

This worked fine before. The Bot.play_sound function in my "production" code is used also to play temporary Text-to-Speech mp3 files, and the waiting is mostly used to remove those temporary files once they are done playing.

iarspider commented 3 months ago

This one works properly:

import asyncio
import os
import time

from pathlib import Path

# from dotenv import load_dotenv
from twitchio.ext import commands, sounds

class Bot(commands.Bot):
    async def play_sound(self, sound: str):
        soundfile = str(Path(__file__).parent / sound)
        sound = sounds.Sound(soundfile)

        print("wait for lock")
        await self.lock.acquire()
        print("play sound", soundfile)
        self.player.play(sound)

    def __init__(self, initial_channels=None):
        super().__init__(
            token=os.getenv("TWITCH_CHAT_PASSWORD"),
            client_id=os.getenv("TWITCH_CHAT_CLIENT_ID"),
            nick="arachnobot",
            prefix="!",
            initial_channels=["#iarspider"],
        )

        self.lock = asyncio.Lock()
        self.player = sounds.AudioPlayer(callback=self.player_done)

    async def event_ready(self):
        print(f"Ready | {self.nick}")
        await self.play_sound("ding-sound-effect_1.mp3")
        await self.play_sound("ding-sound-effect_1.mp3")

    async def player_done(self):
        print("player done")
        self.lock.release()
        pass

async def main():
    global client, twitch_bot
    twitch_bot = Bot()
    await twitch_bot.start()

if __name__ == "__main__":
    # load_dotenv()
    asyncio.run(main())
sockheadrps commented 2 months ago

Thanks for looking into it. With this update however, I'm getting weird results - if I play two sounds in a row, the second one isn't played, and holds the file open for eternity. I will try to post a reproducer later.

Here is the commit for the work I added, it's pretty minimal. https://github.com/PythonistaGuild/TwitchIO/pull/454/commits/0c46ba733addd75367384de536356526f75cbab9

Are you sure this isnt an issue with other parts of your code? I dont think I see necessarily where anything I added could have caused anything like this to happen.

If you look at the output of your code

Ready | arachnobot
play sound e:\Temp\4\ding-sound-effect_1.mp3
slept for 2.95s
play sound e:\Temp\4\ding-sound-effect_1.mp3
slept for 2.95s
Player done

You're only getting the callback once, which seems a bit odd, but looking further, at the code you're experiencing a bug in, you're using time.sleep, which is a blocking function call for one, but also in the event_ready() function definition, you arent signaling anything in the callback function you passed into the sounds.AudioPlayer object that the audio has completed before loading and playing new audio, so it looks like what's causing the issue is that youre playing the first audio file, and then immediately loading another sound file into it, causing the references the first file made to AudioPlayer.play() to be overwritten.

Thats my guess anyways, I didnt test your code, but Im pretty certain it has nothing to do with the merge of this PR, I think it has to do with how youre handling the AudioPlayer itself. I think you should be checking a flag or something set in the callback function before you load and play any other sounds to avoid this kind of issue. Thats how I handle it anyways.

Hope that's helpful, sorry for the late response, I hadn't noticed you had mentioned anything in this issue. Any time you need help with something theres a help thread channel in Twitchio's discord, and any new bugs would be better addressed by opening new issues. No worries, I just feel bad your comment wasnt addressed for more than a month. @iarspider

edit: Theres multiple ways to handle playing audio sequentially, I also have a PR to implement a feature to provide an solution that makes this kind of functionality simpler to work with. I wrote it to close #462 and the maintainers are just really busy working on 3.x and have not merged it yet, its awaiting a final review.

If you would like to look at solutions you can implement yourself in the interim, this is how I handle it in my TTS cog

Its a little messy, sorry. Its personal use project, but is fairly simple.

iarspider commented 2 weeks ago

@sockheadrps Sorry for not replying sooner as well. From a quick look at your code, looks you are handling things in a similar manner (just using global variable instead of lock). I like your implementation of sound queue, btw - nice work!