Simsso closed 5 years ago
`abs_identity_mapping_threshold` hyperparameter by creating a plot "threshold vs. validation accuracy"

Suggesting the following setup:
`act5_block3` (for correctly classified samples) alongside the labels. That yields a list of tuples (label, activation), where activation is a vector of size 4096. The length of the list is 74,246.
`act5_block3`.

Next steps: hyperparameter tuning and positioning of the layer.
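
For illustration, the (label, activation) list described in the setup could be assembled roughly as follows. This is only a sketch in NumPy, not the code used in this repository: the function name `collect_embedding_space` and its arguments are hypothetical, and it assumes the `act5_block3` activations, the ground-truth labels, and the model predictions are already available as arrays.

```python
import numpy as np


def collect_embedding_space(activations, labels, predictions):
    """Return (label, activation) tuples for correctly classified samples.

    Hypothetical helper: `activations` is assumed to have shape
    (num_samples, dim), e.g. dim = 4096 for act5_block3; `labels` and
    `predictions` are assumed to be integer class ids of shape (num_samples,).
    """
    activations = np.asarray(activations)
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)

    # Boolean mask that is True where the model predicted the correct class.
    correct = predictions == labels

    # Pair each surviving label with its activation vector.
    return [(int(label), act)
            for label, act in zip(labels[correct], activations[correct])]


# Toy usage with random data (real activations would come from the network).
rng = np.random.default_rng(0)
toy_acts = rng.normal(size=(5, 4096))
toy_labels = np.array([0, 1, 2, 1, 0])
toy_preds = np.array([0, 2, 2, 1, 1])
toy_pairs = collect_embedding_space(toy_acts, toy_labels, toy_preds)
```

On the full data set this would presumably produce the 74,246 tuples mentioned above, one per correctly classified sample.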
LESCI (thanks for that awesome naming suggestion @FlorianPfisterer) stands for large embedding space constant initialization. This work item involves the development of a LESCI layer (based on a VQ-layer with cosine similarity lookup #63) and its empirical evaluation.
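
To make the idea of a cosine similarity lookup against a constant embedding space more concrete, here is a minimal NumPy sketch. It is an assumption about how such a lookup might work, not the implementation from #63: the name `cosine_lookup`, its arguments, and the `eps` guard are all hypothetical, and the identity-mapping threshold is deliberately left out because its exact semantics are not spelled out here.

```python
import numpy as np


def cosine_lookup(x, embeddings, embedding_labels, eps=1e-12):
    """Replace each input row by its most cosine-similar embedding.

    Hypothetical sketch: `x` has shape (batch, dim), `embeddings` has shape
    (num_embeddings, dim) and stays constant, `embedding_labels` has shape
    (num_embeddings,). Returns the chosen embeddings, their labels, and the
    cosine similarity of each match.
    """
    x = np.asarray(x, dtype=float)
    embeddings = np.asarray(embeddings, dtype=float)
    embedding_labels = np.asarray(embedding_labels)

    # Normalize rows to unit length; eps guards against zero vectors.
    x_unit = x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)
    e_unit = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + eps)

    # Pairwise cosine similarities, shape (batch, num_embeddings).
    similarity = x_unit @ e_unit.T

    # Index of the most similar embedding for every input row.
    best_idx = similarity.argmax(axis=1)
    best_sim = similarity[np.arange(x.shape[0]), best_idx]

    return embeddings[best_idx], embedding_labels[best_idx], best_sim


# Toy usage: two 2-D embeddings with labels 3 and 7.
toy_embeddings = np.array([[1.0, 0.0], [0.0, 1.0]])
toy_emb_labels = np.array([3, 7])
toy_inputs = np.array([[2.0, 0.1], [0.1, 5.0]])
quantized, looked_up_labels, sims = cosine_lookup(
    toy_inputs, toy_embeddings, toy_emb_labels)
```

In this sketch the embedding matrix would be filled once from the collected (label, activation) tuples and never trained, which is one possible reading of "constant initialization".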