-
### Feature request
Enable PPOTrainer and DPOTrainer to work with audio-language models like Qwen2Audio. The architecture of this model is identical to that of vision-language models like LLaVA, consisting of…
-
Hi,
Thank you for the great work and the detailed documentation you have provided. It's been very helpful.
I'm trying to use the 13B model instead of the default 7B model. I downloaded the 13B m…
-
```
Traceback (most recent call last):
  File "/home/jeff/.local/lib/python3.10/site-packages/gradio/queueing.py", line 541, in process_events
    response = await route_utils.call_process_api(
…
```
-
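@ZiqiaoPeng Could the author open-source the training code for the audio_visual_encoder.pth model? After training on Chinese data, the lip-sync accuracy is not very high, so I would like to retrain audio_visual_encoder on a Chinese dataset.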
-
Hi,
With the latest update 2.42.0 (and 2.42.1), I followed the recommendation:
> It is highly recommended to start with new/fresh settings, especially because of the change in audio handling! You can a…
-
It would be good to implement a silence detector function on the audio encoders.
If the input audio drops below a certain configurable level for a certain configurable time (allowing for noise spikes)…
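
To make the request concrete, here is a minimal sketch of such a detector in Python. It is not taken from any existing encoder: the names `frame_rms` and `find_silence`, the use of per-frame RMS as the "level", and the choice to measure time in frames are all assumptions made for illustration. A short loud burst of up to `max_spike_frames` frames is tolerated without restarting the silence count.

```python
import math
from typing import Iterable, Optional, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of samples (hypothetical helper)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def find_silence(
    levels: Iterable[float],
    threshold: float,
    min_silent_frames: int,
    max_spike_frames: int = 0,
) -> Optional[int]:
    """Return the index of the frame at which silence is confirmed, else None.

    A frame is "quiet" when its level is below `threshold`. Silence is
    confirmed once `min_silent_frames` quiet frames have accumulated.
    Loud runs of at most `max_spike_frames` frames are treated as noise
    spikes: they do not count as quiet, but they do not reset the count.
    """
    silent = 0  # quiet frames accumulated so far
    spike = 0   # length of the current run of loud frames
    for i, level in enumerate(levels):
        if level < threshold:
            silent += 1
            spike = 0
            if silent >= min_silent_frames:
                return i
        else:
            spike += 1
            if spike > max_spike_frames:
                # Too long to be a spike: real signal, start over.
                silent = 0
    return None


# Per-frame levels; in practice these might come from frame_rms().
levels = [0.5, 0.01, 0.01, 0.6, 0.01, 0.01, 0.5]
with_tolerance = find_silence(levels, threshold=0.1, min_silent_frames=4, max_spike_frames=1)
without_tolerance = find_silence(levels, threshold=0.1, min_silent_frames=4, max_spike_frames=0)
```

With one frame of spike tolerance, the single loud frame in the middle is ignored and silence is confirmed; without tolerance, the same frame resets the count and nothing is detected. What the encoder should do once silence is confirmed is left open here.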
-
First of all, I have only just begun trying out Captura. And so far, I think it is a great front-end for screen capture with FFmpeg. Thank you very much for this great software.
I would like to be …
-
Shutter Encoder does not preserve audio tracks when encoding AV1. It always re-encodes the audio tracks to Opus, without giving the user the option of keeping them without re-encoding.
-
Hi sir,
If I want to use Q-Former as the projector in acc audiocaps, should the length of the audio encoder placeholder be set to 64?
-
## Environment
- **OS:** Arch Linux 6.9.9-zen1-1-zen
- **Scrcpy version:** 2.5
- **Installation method:** paru -S
- **Device model:** [Xiaomi] POCO 23013PC75G (Android 14)
- **Android vers…