YuanGongND ast issues - Githubissues

YuanGongND / ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

BSD 3-Clause "New" or "Revised" License

1.17k stars, 221 forks

issues

question about normalization

#141 emleeee opened 4 days ago
0
Issue while trying to execute the Speechcommands V2 Recipe

#140 alexandramarkal closed 3 weeks ago
1
please remove this log

#139 Toyoill closed 1 month ago
1
Install requirement issue

#138 SunnyGao0901 opened 2 months ago
3
how to reproduce the same result based on my custom dataset?

#137 jmren168 opened 2 months ago
0
How to incrementally fine-tune AST model with the new data?

#136 jmren168 opened 3 months ago
0
AttributeError: module 'numpy.typing' has no attribute 'NDArray'

#135 nikhilsos closed 4 months ago
3
Discrepancy in Model Performance Using HuggingFace Pipeline Utility

#134 penguinwang96825 opened 5 months ago
5
AssertionError: choose a window size 400 that is [2, 1]

#133 GrafKnusprig opened 5 months ago
2
csv error

#132 mooncv opened 6 months ago
1
some questions when reproducing your results

#131 ben100118 opened 6 months ago
2
Ask for help

#130 Ingram-lin opened 6 months ago
1
self-contained Google Colab script error

#129 moon-aver opened 7 months ago
2
Inquiry Regarding Audio Spectrogram Transformer

#128 Ingram-lin opened 7 months ago
2
One question regarding the linear projection of AST.

#127 poult-lab closed 5 months ago
1
training MAP

#126 maxwZJU opened 8 months ago
2
Different audio sample size for fine-tuning the model gives overfitting issue

#125 aarshilpatel closed 8 months ago
1
When I download the pretrained model with stride=16, I need to change `fstride` and `tstride` in the source code from 10 to 16. Besides these changes, what else do I need to adjust?

#124 zky-66 closed 8 months ago
0
Installing requirements issues

#123 BehrouzGit opened 8 months ago
1
ESC-50-master zip file location has changed

#122 BehrouzGit closed 8 months ago
2
How can I adapt the pretrained AST model to fit my own dataset

#121 zky-66 closed 8 months ago
6
Fine tuning AST model to Music Emotion Classification Overfit

#120 moonquekes opened 10 months ago
4
CPU memory increase while training

#119 gudrb opened 10 months ago
6
After fine-tuning on a 3-class dataset, how to load the fine-tuned weights to update the pre-trained AST model?

#118 jmren168 opened 10 months ago
7
seq2seq classification with AST

#117 YSLCoat opened 12 months ago
2
AST Audioset Training Time and Hardware

#116 justinluong closed 1 year ago
2
how to use my own dataset

#115 wlssyuu opened 1 year ago
3
Installing requirement and CUDA on a fresh virtual environment

#114 blackyx35 opened 1 year ago
1
For own data

#113 ShadowVicky opened 1 year ago
1
RuntimeError: DataLoader worker (pid 39424) exited unexpectedly with exit code 1. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.

#112 wuhongsheng opened 1 year ago
1
ERROR: Cannot install -r requirements.txt (line 10)

#111 wuhongsheng closed 1 year ago
1
Regression Task

#110 BaMarcy opened 1 year ago
1
Huggingface-compatible ImageNet pre-trained weights

#109 penguinwang96825 closed 1 year ago
5
pre-processing about AudioSet (resample to 16kHz)

#108 wisekimm opened 1 year ago
13
What is the objective when pretraining?

#107 Young973 opened 1 year ago
3
Multichannel Audio Input

#106 aaprasad opened 1 year ago
0
How to convert fbank tensor back to waveform?

#105 ebagdasa opened 1 year ago
0
Missing "esc_class_labels_indices.csv" file

#104 kaiw7 closed 1 year ago
2
Application on ASR

#103 chlorane opened 1 year ago
0
Epoch: [4][160156/161048] training diverged...

#102 xiaoli1996 opened 1 year ago
3
Using pretrained model for embeddings extraction with audio input samples of different durations.

#101 sreenivasaupadhyaya opened 1 year ago
5
About AST for Speech Enhancement

#100 kaiw7 opened 1 year ago
5
ast input audio length

#99 syjunghwang closed 1 year ago
0
How to use the pre-trained model on the AudioSet to extract audio features and save them as npy?

#98 ayameyao opened 1 year ago
2
Colab notebook for inference is only partly usable with CPU

#97 Cycerotki opened 1 year ago
3
Performance issues with recorded voices

#96 milad-s5 opened 1 year ago
5
In my own dataset, why is the Avg precision always 0.5 in each epoch?

#95 1244547821 opened 1 year ago
1
SpeechCommands v2

#94 yunzqq opened 1 year ago
4
Training with unbatched data!

#93 fallahim opened 1 year ago
0
Dealing with different audio lengths

#92 OhadCohen97 opened 1 year ago
1