wAIfu-DEV / w-AI-fu_v2

Best way to create your own AI VTuber/Streamer! (OpenAI or NovelAI)
https://www.youtube.com/@Hilda-AI-VTuber
GNU General Public License v3.0
8 stars, 3 forks

BUG: Sing command, works with strange behavior #45

Closed Excelsus999 closed 8 months ago

Excelsus999 commented 8 months ago

I tried making the AI sing, and it no longer works with newly added songs, but for some reason it still works with certain songs that were in the previous version of the app.

https://github.com/wAIfu-DEV/w-AI-fu_v2/assets/149912072/c0d67892-9242-45d6-b199-c061d4f7109b

wAIfu-DEV commented 8 months ago

I think it might be due to a change I made recently. .mp3 files may not work, but .wav files should.

Excelsus999 commented 8 months ago

I managed to solve the issue, but it is still problematic. For the AI to sing through the wAIfu app, both samples must:

A: be WAV files
B: have the same bitrate
C: have a bitrate of exactly 1411 kbps, or else it will not work

I tried many file formats and bitrates (mp3, ogg, flac, M4A, etc.), and nothing works except WAV files with that specific bitrate. Even then, there can still be synchronization issues: one time the songs are badly desynchronized, and the next time they are fine.

Added a video too: https://drive.google.com/file/d/1ZBD95EMi3z0oaEH0kJ_HvbYQj2Db-RXu/view?usp=drive_link

Update: I tried another song that met the same requirements but whose instrumental and vocal tracks had different lengths, and it triggered the error. Both files must also be exactly the same length (1:1).
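The requirements reported in this comment (WAV format, matching bitrate, matching length) can be checked before handing a song to the app. Below is a minimal sketch using only Python's standard `wave` module. The helper names `wav_info`, `bitrate_kbps`, and `check_pair` are hypothetical and not part of w-AI-fu. For reference, 1411 kbps is what uncompressed 16-bit stereo PCM at 44.1 kHz works out to (2 channels × 16 bits × 44100 Hz = 1,411,200 bit/s), so the bitrate requirement is likely equivalent to "CD-quality WAV", though the thread does not confirm that.

```python
import wave


def wav_info(path):
    """Return (channels, sample_width_bytes, frame_rate, n_frames) of a WAV file.

    wave.open raises wave.Error if the file is not an uncompressed PCM WAV.
    """
    with wave.open(path, "rb") as wf:
        return (wf.getnchannels(), wf.getsampwidth(),
                wf.getframerate(), wf.getnframes())


def bitrate_kbps(channels, sample_width, frame_rate):
    """Uncompressed PCM bitrate in kbit/s (1 kbit = 1000 bits)."""
    return channels * sample_width * 8 * frame_rate / 1000


def check_pair(vocals_path, instrumental_path):
    """Return a list of mismatch descriptions; an empty list means the pair matches.

    Hypothetical helper: it only compares the two files with each other,
    it does not know what the app itself accepts.
    """
    problems = []
    v = wav_info(vocals_path)
    i = wav_info(instrumental_path)
    # Channels, sample width and sample rate together determine the bitrate
    if v[:3] != i[:3]:
        problems.append(
            f"format differs: {bitrate_kbps(*v[:3])} kbps vs {bitrate_kbps(*i[:3])} kbps")
    # Same frame count at the same rate means the same duration
    if v[3] != i[3]:
        problems.append(f"length differs: {v[3]} frames vs {i[3]} frames")
    return problems
```

Calling `check_pair("vocals.wav", "instrumental.wav")` (file names are placeholders) on a pair that fails in the app should show which of the conditions above is not met.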

wAIfu-DEV commented 8 months ago

Regarding the synchronization issue, could you edit the file source/app/singing/sing.py, replace its contents with the code below, and tell me if it works for you?

import os
import sys
import time
import wave

import pyaudio

proc_id = None
other_id = None

CHUNK_SIZE = 8192

audio = pyaudio.PyAudio()

def get_current_time():
    return time.time_ns() // 1000000

def play_wav(filename, device):
    global proc_id, other_id, CHUNK_SIZE

    # Open the wave file
    with wave.open(filename, 'rb') as wave_file:
        # Open an output stream on the selected device (e.g. the virtual audio cable)
        audio_stream = None

        try:
            audio_stream = audio.open(
                format=audio.get_format_from_width(wave_file.getsampwidth()),
                channels=wave_file.getnchannels(),
                rate=wave_file.getframerate(),
                frames_per_buffer=CHUNK_SIZE,
                output=True,
                output_device_index=device)  # Index of the output device to play on
        except Exception as e:
            print(f'Cannot use selected audio device as output audio device: {e}', file=sys.stderr)
            return False

        desync_accumulator = 0
        desync_array = []
        iters = 0

        await_sync(proc_id, other_id)

        start_time = get_current_time()
        framerate = wave_file.getframerate()

        data = wave_file.readframes(CHUNK_SIZE)
        while data:
            iters += 1
            playback_time = get_current_time() - start_time
            audio_stream.write(data)

            target_time = wave_file.tell() / framerate * 1000
            drift = playback_time - target_time
            abs_drift = abs(drift)

            desync_accumulator += abs_drift
            desync_array.append(abs_drift)

            frames_to_read = max(1, CHUNK_SIZE + int(drift * 2))
            data = wave_file.readframes(frames_to_read)

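        # Clean up resources
        audio_stream.stop_stream()
        audio_stream.close()

        # An empty file never enters the loop, so there are no drift samples to report
        if iters == 0:
            print(f"Player {proc_id}: {filename} contains no audio frames.", file=sys.stderr)
            return False

        desync_array.sort()
        median = desync_array[len(desync_array) // 2]

        print(f"Player {proc_id}: finished playing: {filename}. desync average(ms): {int(desync_accumulator / iters)}. desync median(ms): {int(median)}")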

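def await_sync(id, other_id):
    # Each player creates its own lock file, then waits for the other player's
    sync1 = f'sync{id}.lock'
    sync2 = f'sync{other_id}.lock'
    with open(sync1, 'w') as f:
        f.write('')
    # Busy-wait until the other player's lock file exists
    while not os.path.exists(sync2):
        pass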

if __name__ == '__main__':
    file = sys.argv[1]
    device = int(sys.argv[2])
    proc_id = int(sys.argv[3])
    other_id = 0 if proc_id == 1 else 1

    play_wav(file, device)

Excelsus999 commented 8 months ago

I tried many songs with the code. It improves the synchronization a lot; it is not perfect, but it is close. I have also tried some general audio settings that give very consistent results across many songs, at the expense of some quality. Why? God knows.

It usually works better from the second time you play a song, and the same goes for the playlist.

I use Voicemeeter Banana, so maybe that is the cause? I might switch to the general VB-CABLE A+B and test it.

https://github.com/wAIfu-DEV/w-AI-fu_v2/assets/149912072/230cfa81-5b64-4c7b-a5be-0ab17ec16a7b