YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
BSD 3-Clause "New" or "Revised" License
1.06k stars
203 forks
issues
Discrepancy in Model Performance Using HuggingFace Pipeline Utility
#134
penguinwang96825
opened
1 week ago
5
AssertionError: choose a window size 400 that is [2, 1]
#133
GrafKnusprig
opened
1 week ago
2
csv error
#132
mooncv
opened
1 month ago
1
some questions when reproducing your results
#131
ben100118
opened
1 month ago
2
Ask for help
#130
Ingram-lin
opened
1 month ago
1
self-contained Google Colab script error
#129
moon-aver
opened
2 months ago
2
Inquiry Regarding Audio Spectrogram Transformer
#128
Ingram-lin
opened
2 months ago
2
One question regarding the linear projection of AST.
#127
poult-lab
closed
3 weeks ago
1
training MAP
#126
maxwZJU
opened
3 months ago
2
Different audio sample size for fine-tuning the model gives overfitting issue
#125
aarshilpatel
closed
3 months ago
1
When I download the pretrained model with stride=16, I need to change `fstride` and `tstride` in the source code from 10 to 16. Besides these changes, what else do I need to adjust?
#124
zky-66
closed
3 months ago
0
Installing requirements issues
#123
BehrouzGit
opened
3 months ago
0
ESC-50-master zip file location has changed
#122
BehrouzGit
closed
3 months ago
2
How can I adapt the pretrained AST model to fit my own dataset
#121
zky-66
closed
3 months ago
6
Fine tuning AST model to Music Emotion Classification Overfit
#120
moonquekes
opened
5 months ago
2
CPU memory increase while training
#119
gudrb
opened
5 months ago
6
After fine-tune a 3-class dataset, how to load its fine-tuned weighted to update pre-trained ast model?
#118
jmren168
opened
5 months ago
7
seq2seq classification with AST
#117
YSLCoat
opened
7 months ago
2
AST Audioset Training Time and Hardware
#116
justinluong
closed
8 months ago
2
how to use my own dataset
#115
wlssyuu
opened
9 months ago
3
Installing requirements and CUDA on a fresh virtual environment
#114
blackyx35
opened
9 months ago
1
For own data
#113
ShadowVicky
opened
9 months ago
1
RuntimeError: DataLoader worker (pid 39424) exited unexpectedly with exit code 1. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.
#112
wuhongsheng
opened
10 months ago
1
ERROR: Cannot install -r requirements.txt (line 10)
#111
wuhongsheng
closed
10 months ago
1
Regression Task
#110
BaMarcy
opened
10 months ago
1
Huggingface-compatible ImageNet pre-trained weights
#109
penguinwang96825
closed
9 months ago
5
pre-processing about AudioSet (resample to 16kHz)
#108
wisekimm
opened
11 months ago
13
What is the objective when pretraining?
#107
Young973
opened
11 months ago
3
Multichannel Audio Input
#106
aaprasad
opened
12 months ago
0
How to convert fbank tensor back to waveform?
#105
ebagdasa
opened
1 year ago
0
Missing "esc_class_labels_indices.csv" file
#104
kaiw7
closed
1 year ago
2
Application on ASR
#103
chlorane
opened
1 year ago
0
Epoch: [4][160156/161048] training diverged...
#102
xiaoli1996
opened
1 year ago
3
Using pretrained model for embeddings extraction with audio input samples of different durations.
#101
sreenivasaupadhyaya
opened
1 year ago
5
About AST for Speech Enhancement
#100
kaiw7
opened
1 year ago
5
ast input audio length
#99
syjunghwang
closed
1 year ago
0
How to use the pre-trained model on the AudioSet to extract audio features and save them as npy?
#98
ayameyao
opened
1 year ago
2
Colab notebook for inference is only partly usable with CPU
#97
Cycerotki
opened
1 year ago
3
Performance issues with recorded voices
#96
milad-s5
opened
1 year ago
5
In my own dataset, why is the Avg precision always 0.5 in each epoch?
#95
1244547821
opened
1 year ago
1
SpeechCommands v2
#94
yunzqq
opened
1 year ago
4
Training with Unbatch data!
#93
fallahim
opened
1 year ago
0
Dealing with different audio lengths
#92
OhadCohen97
opened
1 year ago
1
How to configure the dataset or modify the code if I want to do the one class binary classification
#91
nanyyyyyy
opened
1 year ago
14
Can AST be used for audio representation towards solving the frame-level classification tasks?
#90
SylviaZiyaZhou
opened
1 year ago
4
Cannot create a file when that file already exists: './exp/test-esc50-f10-t10-impTrue-aspTrue-b48-lr1e-5/fold1/models'
#89
JonathanFL
closed
1 year ago
4
About code"100-106" from dataloader.py
#88
poult-lab
opened
1 year ago
6
Audio length 1s
#87
9B8DY6
closed
1 year ago
5
Q: Audio-MAE reports that AST's performance on VoxCeleb1 is 41.1%, but not listed on the AST paper
#86
daisukelab
closed
1 year ago
2
The problem of reproducing the AST result in full dataset
#85
MichaelLynn1996
opened
1 year ago
9