-
"As online generative AI platforms such as ‘ChatGPT’ continue to offer consumers the unrestricted ability to create text and images, including video and audio from a variety of inputs, [it is importan…
-
## 🚀 Feature
Add new audio metrics for generative audio processing
### Motivation
The evaluation of speech processing (denoising, dereverberation and in general enhancement) highly depends o…
-
## Chinatown.js Talk Submission
**Talk Title:**
Authorship Without Control: The role of chaos and JavaScript in generative music
**Talk Description:**
Through thinking about the web browse…
-
### Is your feature request related to a problem? Please describe.
Creating test cases for the CoE Starter Kit Setup and Upgrade process is currently a manual process. Make use of Generative AI pr…
-
### Discussed in https://github.com/orgs/langfuse/discussions/4363
Originally posted by **pleomax0730** November 21, 2024
## Current observation
- [Image Trace](https://cloud.langfuse.com/pro…
-
Many users face limitations in manipulating and enhancing audio recordings obtained through microphones. Traditional methods may lack precision or require extensive manual effort.
So as a solution I …
-
(1) In the paper, the first stage takes a reference frame, audio, and a target frame as input.
![image](https://github.com/user-attachments/assets/92c7ddfd-df7f-4ece-a1cc-b557ab4e5824)
However, the current code still appears to be hallo1's: https://github1s.com/fudan-generative-vision/hallo2/blob/HE…
-
**Describe the bug**
When looking through the test for the Google Gemini AI, I noticed that the test called `should_support_video_file` does not test a video but rather an audio upload. It is just…
-
## Goal
Experiment with the WhisperVQ model to get better multilingual results. Hypothesis: the current codebook has only 512 entries, which is too small a space to compress the multilingual capability into.
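
To make the hypothesis concrete, here is a rough sketch of how much information one discrete token can carry at different codebook sizes. It assumes a single codebook with uniformly used entries, which gives an upper bound of log2(size) bits per token. Only the size 512 comes from this issue; the larger sizes and the helper name `bits_per_token` are hypothetical choices for illustration.

```python
import math


def bits_per_token(codebook_size: int) -> float:
    """Upper bound on the information per token, assuming one codebook
    whose entries are all used equally often (an idealisation)."""
    if codebook_size < 1:
        raise ValueError("codebook_size must be a positive integer")
    return math.log2(codebook_size)


# 512 is the size named in the issue; the others are hypothetical candidates.
for size in (512, 1024, 2048, 4096):
    print(f"codebook={size:5d}  max bits/token={bits_per_token(size):.1f}")
```

Each doubling of the codebook adds one bit per token at most, so the gain from a larger codebook also depends on how evenly the entries end up being used.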
## Learning Goa…
-
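## In a nutshell
A study that applies GANs to audio. It proposes two variants: a waveform-based model (WaveGAN) and a spectrogram-based model (SpecGAN). Because audio is periodic and capturing its features requires a long span, it uses one-dimensional filters (size 25) and a larger upsampling factor (4) than is used for images. The results show that WaveGAN gives better sound quality, while SpecGAN leaves a better overall impression.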
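
As a rough illustration of what a filter size of 25 with an upsampling factor of 4 means for sequence length, the sketch below applies the standard output-length formula for a 1D transposed convolution (no dilation). The filter size and factor come from the summary above. The padding of 11, output padding of 1, starting length of 16, and five layers are assumptions chosen so that each layer multiplies the length by exactly 4; they are not stated in this summary.

```python
def conv_transpose1d_out_len(length: int, kernel_size: int = 25, stride: int = 4,
                             padding: int = 11, output_padding: int = 1) -> int:
    """Output length of a 1D transposed convolution without dilation.

    padding and output_padding are assumed values, picked so that the
    output is exactly `stride` times the input length.
    """
    return (length - 1) * stride - 2 * padding + kernel_size + output_padding


def generator_lengths(start_len: int = 16, num_layers: int = 5) -> list[int]:
    """Sequence length after each upsampling layer (start_len and
    num_layers are assumptions for illustration)."""
    lengths = [start_len]
    for _ in range(num_layers):
        lengths.append(conv_transpose1d_out_len(lengths[-1]))
    return lengths


print(generator_lengths())
```

With these assumed settings, five layers take 16 samples to 16 × 4⁵ = 16384 samples, which shows why a factor of 4 per layer reaches audio-length outputs in far fewer layers than a factor of 2 would.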
### Paper link
https:…