xyutao / fscil

Official repository for Few-Shot Class-Incremental Learning (FSCIL)

Some questions about the implementation details of your great work #7

Closed ScutQi closed 4 years ago

ScutQi commented 4 years ago

Thanks for your great work. I am interested in it and am trying to implement it in PyTorch, but I have run into several problems. I would appreciate it if you could answer my questions.

Q1: At session t=1, how do you initialize the centroid vector of each NG node? With k-means or random initialization?

Q2: When calculating the anchor loss, you extract a subgraph of G(t). Is there a restriction on the subgraph? G(t) has many subgraphs, so which one should be chosen to calculate the anchor loss?

Thank you very much!

xyutao commented 4 years ago

Thanks for your attention to our work. Here are my answers:

A1: At the initial session, we randomly pick N feature vectors from the training feature vector set to initialize the NG nodes.

A2: The anchor loss is calculated on the subgraph whose NG nodes were learned in previous sessions. These nodes are assigned old-class labels (i.e., c \in \bigcup_{i=1}^{t-1} L(i)) and should be stabilized to avoid forgetting.

You may treat G(t) as a combination of two subgraphs, G_o(t) and G_n(t), which store the NG nodes for the old and new classes, respectively.
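For anyone porting this, here is a minimal NumPy sketch of the two answers above: random initialization of the NG centroids from training features, and splitting the nodes into G_o(t) and G_n(t) by label. It is not the repository's code. The function names `init_ng_nodes` and `split_old_new`, the toy data, and the choice to give each node the label of the feature it was sampled from are all assumptions made for illustration.

```python
import numpy as np

def init_ng_nodes(features, labels, num_nodes, seed=0):
    """Pick num_nodes distinct training feature vectors as initial NG centroids.

    Assumption: each node starts with the label of the sampled feature.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(features), size=num_nodes, replace=False)
    return features[idx].copy(), labels[idx].copy()

def split_old_new(node_labels, old_classes):
    """Return boolean masks selecting the nodes of G_o(t) and G_n(t)."""
    old_mask = np.isin(node_labels, list(old_classes))
    return old_mask, ~old_mask

# Toy data: 200 feature vectors of dimension 8, labels in {0, 1, 2, 3}.
data_rng = np.random.default_rng(1)
features = data_rng.normal(size=(200, 8))
labels = data_rng.integers(0, 4, size=200)

centroids, node_labels = init_ng_nodes(features, labels, num_nodes=20)

# Pretend classes 0..2 came from earlier sessions and class 3 is new.
old_mask, new_mask = split_old_new(node_labels, old_classes={0, 1, 2})
old_centroids = centroids[old_mask]  # the anchor loss would apply to these only
```

Under this reading, the anchor loss only ever sees `old_centroids`, so no further choice among subgraphs is needed.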

ScutQi commented 4 years ago

Thank you for answering my questions! I have two more:

Q1: How do you fine-tune the pre-trained network? For example, if we use the base class set (60 classes) to train QuickNet at session t=1, the output layer has 60 neurons. At session t=2, when you use the new class set (5 classes) to fine-tune the network, do you add 5 new neurons directly to the output layer and train the resulting network on the new set?

Q2: At session t>1, how do you update z_j and c_j of the NG node v_j? As I understand it, at session t=1 they are updated by this rule: for every NG node, find the nearest f in F(t), and use f and its label as the pseudo image and label. If the same rule is applied at session t>1, the c_j of all NG nodes will become new-class labels, because D(t-1) is unseen at session t. Is my understanding right?

I would appreciate your reply.

xyutao commented 4 years ago

A1: Yes, we simply add 5 new neurons directly to the output layer and fine-tune the entire network with the new-class training set.

A2: At session t>1, we do not need to update the z_j and c_j of the old NG nodes, since we aim to stabilize these nodes using the anchor loss. We only assign z_j and c_j from D(t) to the new NG nodes.
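A rough NumPy sketch of both answers, for illustration only: growing a 60-way output layer to 65 outputs while keeping the trained rows, and appending new NG nodes without touching the old ones. The helper names `expand_classifier` and `append_new_nodes`, the small Gaussian initialization of the new rows, and the toy shapes are assumptions, not details confirmed in this thread. The weight layout (outputs, inputs) matches the one PyTorch's `nn.Linear` uses, so the same copy-then-extend idea should carry over.

```python
import numpy as np

def expand_classifier(weight, bias, num_new, seed=0, scale=0.01):
    """Append num_new output neurons; rows for the old classes are kept as-is.

    Assumption: new rows get a small Gaussian init and zero bias.
    """
    rng = np.random.default_rng(seed)
    new_w = rng.normal(scale=scale, size=(num_new, weight.shape[1]))
    new_b = np.zeros(num_new)
    return np.vstack([weight, new_w]), np.concatenate([bias, new_b])

def append_new_nodes(centroids, node_labels, new_centroids, new_labels):
    """Add new-class NG nodes after the old ones, which stay unchanged."""
    all_centroids = np.vstack([centroids, new_centroids])
    all_labels = np.concatenate([node_labels, new_labels])
    return all_centroids, all_labels

toy_rng = np.random.default_rng(2)

# Session t=1: 60 base classes, 64-dimensional features (toy sizes).
w1 = toy_rng.normal(size=(60, 64))
b1 = np.zeros(60)

# Session t=2: 5 new classes, so 5 new output neurons.
w2, b2 = expand_classifier(w1, b1, num_new=5)

# NG nodes: 100 old nodes stay fixed, 10 new nodes labeled with classes 60..64.
old_c = toy_rng.normal(size=(100, 64))
old_l = toy_rng.integers(0, 60, size=100)
new_c = toy_rng.normal(size=(10, 64))
new_l = toy_rng.integers(60, 65, size=10)
all_c, all_l = append_new_nodes(old_c, old_l, new_c, new_l)
```

After expansion the whole network, old rows included, would still be fine-tuned on the new-class data as described in A1; the sketch only covers how the layer and the node set grow.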

ScutQi commented 4 years ago

Thanks for your reply!