YuanGongND/ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
BSD 3-Clause "New" or "Revised" License
365 stars
61 forks
issues
Use Audioset20K for fine tuning
#38
sunyulin0421
opened
2 months ago
0
experiment setup
#37
yunzqq
opened
3 months ago
0
Use SSAST pretrained model to inference
#36
gavinwwf
opened
5 months ago
0
question: example usage for inference, rather than training
#35
sammlapp
opened
6 months ago
1
SSAST for embedding model
#34
Ironmomo
opened
6 months ago
0
audio file length
#33
fabianbosshard
opened
7 months ago
4
performance and loss of the frame-based model
#32
fabianbosshard
closed
7 months ago
2
Data Loading Logs
#31
fabianbosshard
opened
7 months ago
2
Are the model weights loaded into the ASTModel with stochasticity?
#30
michaelschwob
opened
9 months ago
2
I've been stuck on the first epoch in a tiny model for 8+ hours using 300 5s spectrograms for training. Is this normal?
#29
michaelschwob
closed
9 months ago
3
AttributeError: 'ASTModel' object has no attribute 'unfold'
#28
michaelschwob
opened
9 months ago
8
Librispeech960 Pretrained Model
#27
JDRanpariya
opened
11 months ago
3
For discriminative loss, is the true NCE batch size the number of masked patches?
#26
hillup
opened
1 year ago
2
Which python version are you using?
#25
antoinecomp
opened
1 year ago
1
Host on Huggingface
#24
penguinwang96825
opened
1 year ago
2
python and cuda version
#23
nathanchenseanwalter
opened
1 year ago
3
sed
#22
fuguanyu
opened
1 year ago
1
Extract audio representations for future use
#21
MorenoLaQuatra
opened
1 year ago
13
speech commands v1
#20
yunzqq
opened
1 year ago
3
Model fails to converge on transfer to audio backtesting problem
#19
yangma12
opened
1 year ago
5
Stereo audio
#18
matthiasanderer
closed
1 year ago
1
Bug with input_dim and pure inference
#17
fanOfJava
opened
1 year ago
11
I want to use frame-level ssast just for frame-level audio token extraction
#16
9B8DY6
opened
1 year ago
3
How to convert fbank features back to audio?
#15
linmou
closed
1 year ago
3
Master
#14
AviadDahan
closed
2 years ago
0
about nce loss cal
#13
liyunlongaaa
closed
2 years ago
3
A question about ESC's AC
#12
liyunlongaaa
closed
2 years ago
2
Target length
#11
kremHabashy
opened
2 years ago
2
fine-tuning sample rate
#10
skirdey
closed
2 years ago
1
Embeddings without fine tuning
#9
HugoBothaMD
closed
1 year ago
5
Create LICENSE
#8
YuanGongND
closed
2 years ago
0
The model loaded is not from a torch.nn.Dataparallel object
#7
kremHabashy
closed
2 years ago
19
Half precision spectrograms - Mixed precision training
#6
treasan
opened
2 years ago
1
Loss curves
#5
treasan
opened
2 years ago
4
Trouble with pure inference
#4
beyondbeneath
opened
2 years ago
14
Dataset mean / stdev normalization
#3
MikeKras
opened
2 years ago
20
Potential bug/sub-optimal implementation of final output activation and losses
#2
harrygcoppock
opened
2 years ago
2
ssast pretrained solely on Audioset-2M
#1
Moadab-AI
closed
2 years ago
2