YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
BSD 3-Clause "New" or "Revised" License
1.17k stars, 221 forks
issues
question about normalization
#141
emleeee
opened
4 days ago
0
Issue while trying to execute the Speechcommands V2 Recipe
#140
alexandramarkal
closed
3 weeks ago
1
please remove this log
#139
Toyoill
closed
1 month ago
1
Install requirement issue
#138
SunnyGao0901
opened
2 months ago
3
how to reproduce the same result based on my custom dataset?
#137
jmren168
opened
2 months ago
0
How to incrementally fine-tune AST model with the new data?
#136
jmren168
opened
3 months ago
0
AttributeError: module 'numpy.typing' has no attribute 'NDArray'
#135
nikhilsos
closed
4 months ago
3
Discrepancy in Model Performance Using HuggingFace Pipeline Utility
#134
penguinwang96825
opened
5 months ago
5
AssertionError: choose a window size 400 that is [2, 1]
#133
GrafKnusprig
opened
5 months ago
2
csv error
#132
mooncv
opened
6 months ago
1
some questions when reproducing your results
#131
ben100118
opened
6 months ago
2
Ask for help
#130
Ingram-lin
opened
6 months ago
1
self-contained Google Colab script error
#129
moon-aver
opened
7 months ago
2
Inquiry Regarding Audio Spectrogram Transformer
#128
Ingram-lin
opened
7 months ago
2
One question regarding the linear projection of AST.
#127
poult-lab
closed
5 months ago
1
training MAP
#126
maxwZJU
opened
8 months ago
2
Different audio sample size for fine-tuning the model gives overfitting issue
#125
aarshilpatel
closed
8 months ago
1
When I download the pretrained model with stride=16, I need to change `fstride` and `tstride` in the source code from 10 to 16. Besides these changes, what else do I need to adjust?
#124
zky-66
closed
8 months ago
0
Installing requirements issues
#123
BehrouzGit
opened
8 months ago
1
ESC-50-master zip file location has changed
#122
BehrouzGit
closed
8 months ago
2
How can I adapt the pretrained AST model to fit my own dataset
#121
zky-66
closed
8 months ago
6
Fine tuning AST model to Music Emotion Classification Overfit
#120
moonquekes
opened
10 months ago
4
CPU memory increase while training
#119
gudrb
opened
10 months ago
6
After fine-tune a 3-class dataset, how to load its fine-tuned weighted to update pre-trained ast model?
#118
jmren168
opened
10 months ago
7
seq2seq classification with AST
#117
YSLCoat
opened
12 months ago
2
AST Audioset Training Time and Hardware
#116
justinluong
closed
1 year ago
2
how to use my own dataset
#115
wlssyuu
opened
1 year ago
3
Installing requirement and CUDA on a fresh virtual environment
#114
blackyx35
opened
1 year ago
1
For own data
#113
ShadowVicky
opened
1 year ago
1
RuntimeError: DataLoader worker (pid 39424) exited unexpectedly with exit code 1. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.
#112
wuhongsheng
opened
1 year ago
1
ERROR: Cannot install -r requirements.txt (line 10)
#111
wuhongsheng
closed
1 year ago
1
Regression Task
#110
BaMarcy
opened
1 year ago
1
Huggingface-compatible ImageNet pre-trained weights
#109
penguinwang96825
closed
1 year ago
5
pre-processing about AudioSet (resample to 16kHz)
#108
wisekimm
opened
1 year ago
13
What is the objective when pretraining?
#107
Young973
opened
1 year ago
3
Multichannel Audio Input
#106
aaprasad
opened
1 year ago
0
How to convert fbank tensor back to waveform?
#105
ebagdasa
opened
1 year ago
0
Missing "esc_class_labels_indices.csv" file
#104
kaiw7
closed
1 year ago
2
Application on ASR
#103
chlorane
opened
1 year ago
0
Epoch: [4][160156/161048] training diverged...
#102
xiaoli1996
opened
1 year ago
3
Using pretrained model for embeddings extraction with audio input samples of different durations.
#101
sreenivasaupadhyaya
opened
1 year ago
5
About AST for Speech Enhancement
#100
kaiw7
opened
1 year ago
5
ast input audio length
#99
syjunghwang
closed
1 year ago
0
How to use the pre-trained model on the AudioSet to extract audio features and save them as npy?
#98
ayameyao
opened
1 year ago
2
Colab notebook for inference is only partly usable with CPU
#97
Cycerotki
opened
1 year ago
3
Performance issues with recorded voices
#96
milad-s5
opened
1 year ago
5
In my own dataset, why is the Avg precision always 0.5 in each epoch?
#95
1244547821
opened
1 year ago
1
SpeechCommands v2
#94
yunzqq
opened
1 year ago
4
Training with Unbatch data!
#93
fallahim
opened
1 year ago
0
Dealing with different audio lengths
#92
OhadCohen97
opened
1 year ago
1