DaToKi closed this issue 4 years ago
Hi Dan!
Thank you for your interest! For the cancer genomics dataset, the input is basically a matrix with rows corresponding to samples (or patients) and columns corresponding to features. Suppose this matrix has 100 rows (corresponding to 100 patients) and 2000 columns (corresponding to 2000 genomic features). Each patient has a disease subtype (i.e., a class), but we only know the disease subtypes of 10 patients. The input would still be the entire 100×2000 matrix, but you only backpropagate the error for the 10 patients who have true labels. In other words, we use both labeled and unlabeled data during training. Additionally, if you already know similarities among the 100 patients, you can include that similarity graph (a 100×100 matrix) as input as well.
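To make the "backpropagate only on labeled patients" part concrete, here is a minimal PyTorch sketch. The model, dimensions, and `labeled_idx` mask are hypothetical stand-ins, not the actual AffinityNet code; the point is only that the forward pass sees all samples while the loss uses the labeled subset.

```python
import torch
import torch.nn as nn

# Toy dimensions matching the example above: 100 patients, 2000 genomic features.
num_samples, num_features, num_classes = 100, 2000, 4

x = torch.randn(num_samples, num_features)        # full feature matrix (labeled + unlabeled)
labels = torch.randint(0, num_classes, (num_samples,))
labeled_idx = torch.arange(10)                    # only 10 patients have known subtypes

model = nn.Sequential(                            # stand-in for the actual network
    nn.Linear(num_features, 64),
    nn.ReLU(),
    nn.Linear(64, num_classes),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

logits = model(x)                                 # forward pass uses all 100 samples
loss = nn.functional.cross_entropy(               # loss computed only on the labeled rows
    logits[labeled_idx], labels[labeled_idx]
)
optimizer.zero_grad()
loss.backward()                                   # gradients come only from the labeled patients' loss
optimizer.step()
```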
For images, the features are oriented in 2D. Because I used fully connected layers for feature transformations in this repository, it is probably not well suited to image inputs. For images, you can replace the fully connected layers with convolutional layers; the rest is similar. Alternatively, you can first use a CNN to extract a latent one-dimensional feature vector for each image, and then apply kNN attention pooling to the resulting 2D (samples × features) matrix, as in the sketch below. Of course, you can still train it end-to-end.
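A rough sketch of the second option, under stated assumptions: `ImageEncoder` below is a hypothetical CNN, not a module from this repository, and the kNN attention pooling stage itself is only referenced in a comment rather than implemented.

```python
import torch
import torch.nn as nn

# Hypothetical CNN encoder: maps each image to a 1-D latent feature vector.
class ImageEncoder(nn.Module):
    def __init__(self, out_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, out_dim)

    def forward(self, images):              # images: (N, 3, H, W)
        h = self.conv(images).flatten(1)    # (N, 32)
        return self.fc(h)                   # (N, out_dim)

images = torch.randn(100, 3, 64, 64)        # e.g. 100 images
features = ImageEncoder()(images)           # (100, 128) samples-by-features matrix
# `features` now has the same 2D shape as the genomics matrix, so it can be fed
# to the kNN attention pooling layers and the whole pipeline trained end-to-end.
```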
Hi BeautyOfWeb,
first of all, great work and great paper.
I think your work is really interesting.
I have a question regarding the data you used.
You describe the data as cancer genomics datasets.
I would like to know more about the dataset / its structure.
Does it contain images that are fed into your AffinityNetwork? That's what most few-shot learning models do.
Is it possible to see an example of the input data?
Maybe how a class is defined.
Thanks a lot!
Best regards,
Dan T. Lion