YuanGongND/ltu
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
389 stars, 36 forks
issues
Question about different prompts
#53
maxwZJU
opened
1 month ago
0
Loading the CAV-MAE model
#52
HuangZiliAndy
opened
1 month ago
0
length limit for input audio
#51
BongkiLee
opened
2 months ago
0
How to replace the audio encoder in the model?
#50
zhangron013
closed
2 months ago
2
Inference of 13B (Beta)
#49
nicolaus625
opened
2 months ago
3
Question: About the OpenAQA dataset.
#48
wangwen-banban
closed
2 months ago
2
Question: Where is the setting to freeze the backbone LLM like LLaVA?
#47
Kurt232
opened
2 months ago
1
Question: Is the output of LTU expected to be poor quality in the early stage?
#46
Kurt232
opened
2 months ago
1
Why does pad_or_trim use 1000 rather than 3000 in transcribe_audio?
#45
peggyxpxu
opened
4 months ago
1
Where is the Whisper decoder used?
#44
peggyxpxu
opened
4 months ago
2
Why loss is always 0?
#43
yangdongdong2000
opened
4 months ago
0
Stage training sh scripts for low resource
#42
yangdongdong2000
opened
4 months ago
1
where to download whisper model?
#41
yangdongdong2000
opened
4 months ago
2
Error when run finetune_toy_low_resource.sh
#40
blue-blue272
opened
4 months ago
0
About the audio-text pair of AudioSet dataset.
#39
blue-blue272
opened
4 months ago
1
Question: Half Float Inference?
#38
IanZ2020
opened
5 months ago
1
train_scripts
#37
yangdongdong2000
opened
5 months ago
3
Modifications to the llama model
#36
peggyxpxu
opened
6 months ago
1
Question: LLaMA-7B LLM
#35
peggyxpxu
opened
6 months ago
2
Question: Why are the prompts for training and inference for audio event classification different?
#34
peggyxpxu
opened
6 months ago
2
OpenAQA Dataset's audio files
#33
CleyLyChen
opened
6 months ago
2
question on cutoff_len
#32
BenoitWang
opened
7 months ago
1
Eval code error
#31
peggyxpxu
opened
7 months ago
4
LICENSE of AQA datasets and checkpoints
#30
joemzhao
opened
7 months ago
0
Eval_metrics
#29
joemzhao
closed
7 months ago
4
Question about vicuna version
#28
CleyLyChen
closed
7 months ago
2
Question about Finetune exp
#27
doubleHon
opened
7 months ago
4
Issue with Loading 13B Model: Size Mismatch Error
#26
EnisBerk
opened
7 months ago
4
Batch Inference Support
#25
EnisBerk
closed
8 months ago
0
Maximum Length for LTU-AS Audio Input
#24
dingdongwang
opened
9 months ago
1
CPU local inference is not working.
#23
vivekupadhyay1
opened
9 months ago
0
Question about model loading in inference
#22
dingdongwang
opened
9 months ago
2
Issue while loading openaqa_5.6M.json
#21
sonalkum
opened
9 months ago
4
Running Issue about Low-Resource Training for LTU-AS
#20
dingdongwang
opened
9 months ago
8
Question about Multi-GPU Training
#19
dingdongwang
opened
9 months ago
1
Question about LTU-AS base model
#18
dingdongwang
opened
9 months ago
1
Question about LTU-AS Downstream Tasks
#17
dingdongwang
opened
9 months ago
3
LTU_AS ASR Task
#16
dingdongwang
opened
9 months ago
4
extract_whisper_feature.py
#15
dingdongwang
opened
9 months ago
3
Missing Checkpoints
#14
Sreyan88
closed
9 months ago
8
Missing Tokenize Audio Info during Fine-tuning/Training
#13
dingdongwang
opened
10 months ago
1
How to process audio that exceeds 10 seconds in length
#12
qisawO3
opened
10 months ago
5
no evaluation script for open-set problem
#11
alexaway
opened
10 months ago
1
whisper-at on cuda:1
#10
alexanderwerning
opened
10 months ago
1
Model Parallelization
#9
BhashaBluff
opened
10 months ago
5
Requirements for more pretrained weights
#8
Ming-er
closed
10 months ago
5
vicuna_ltu model file missing
#7
zengxijuan
opened
11 months ago
1
Which model is 7B (Default) and which is 13B (Beta)?
#6
yl4579
opened
11 months ago
12
Question about the Realism of Simulated Acoustic Event Combinations in Data Generation
#5
haoxiangsnr
opened
11 months ago
2
About the experimental results of the paper LTU-AS
#4
yangyuxiang1996
opened
1 year ago
1