YuanGongND / ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".

BSD 3-Clause "New" or "Revised" License

365 stars, 61 forks

issues

Use Audioset20K for fine tuning

#38 sunyulin0421 opened 2 months ago
0
experiment setup

#37 yunzqq opened 3 months ago
0
Use SSAST pretrained model to inference

#36 gavinwwf opened 5 months ago
0
question: example usage for inference, rather than training

#35 sammlapp opened 6 months ago
1
SSAST for embedding model

#34 Ironmomo opened 6 months ago
0
audio file length

#33 fabianbosshard opened 7 months ago
4
performance and loss of the frame-based model

#32 fabianbosshard closed 7 months ago
2
Data Loading Logs

#31 fabianbosshard opened 7 months ago
2
Are the model weights loaded into the ASTModel with stochasticity?

#30 michaelschwob opened 9 months ago
2
I've been stuck on the first epoch in a tiny model for 8+ hours using 300 5s spectrograms for training. Is this normal?

#29 michaelschwob closed 9 months ago
3
AttributeError: 'ASTModel' object has no attribute 'unfold'

#28 michaelschwob opened 9 months ago
8
Librispeech960 Pretrained Model

#27 JDRanpariya opened 11 months ago
3
For discriminative loss, is the true NCE batch size the number of masked patches?

#26 hillup opened 1 year ago
2
Which python version are you using?

#25 antoinecomp opened 1 year ago
1
Host on Huggingface

#24 penguinwang96825 opened 1 year ago
2
python and cuda version

#23 nathanchenseanwalter opened 1 year ago
3
sed

#22 fuguanyu opened 1 year ago
1
Extract audio representations for future use

#21 MorenoLaQuatra opened 1 year ago
13
speech commands v1

#20 yunzqq opened 1 year ago
3
Model fails to converge on transfer to audio backtesting problem

#19 yangma12 opened 1 year ago
5
Stereo audio

#18 matthiasanderer closed 1 year ago
1
Bug with input_dim and pure inference

#17 fanOfJava opened 1 year ago
11
I want to use frame-level ssast just for frame-level audio token extraction

#16 9B8DY6 opened 1 year ago
3
How to convert fbank features back to audio ?

#15 linmou closed 1 year ago
3
Master

#14 AviadDahan closed 2 years ago
0
about nce loss cal

#13 liyunlongaaa closed 2 years ago
3
A question about ESC's AC

#12 liyunlongaaa closed 2 years ago
2
Target length

#11 kremHabashy opened 2 years ago
2
fine-tuning sample rate

#10 skirdey closed 2 years ago
1
Embeddings without fine tuning

#9 HugoBothaMD closed 1 year ago
5
Create LICENSE

#8 YuanGongND closed 2 years ago
0
The model loaded is not from a torch.nn.Dataparallel object

#7 kremHabashy closed 2 years ago
19
Half precision spectrograms - Mixed precision training

#6 treasan opened 2 years ago
1
Loss curves

#5 treasan opened 2 years ago
4
Trouble with pure inference

#4 beyondbeneath opened 2 years ago
14
Dataset mean / stdev normalization

#3 MikeKras opened 2 years ago
20
Potential bug/sub-optimal implementation of final output activation and losses

#2 harrygcoppock opened 2 years ago
2
ssast pretrained solely on Audioset-2M

#1 Moadab-AI closed 2 years ago
2