Closed: itsmemala closed this issue 1 year ago
Hello,

If I understand correctly, lines 277-337 in `networks/base/adapters.py` implement the similarity-estimation and transfer-routing mechanisms from the CTR paper ("Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning") - is that right?

The paper mentions that a TK-layer capsule is added for each new task, so I would expect only one TK capsule at the beginning, with the others added progressively. However, per line 287, it appears that the number of capsules is fixed (equal to the number of tasks). Also, printing the values of the `decision_maker` tensor (which corresponds to the output of the TR component in the paper) after the for loop gives values such as [1,1,0,1,1,1] when training on the first of 6 tasks. This doesn't seem right, since there should be no connectivity between the current capsule and future capsules while training on the first task.

Could you please explain, and correct me if I'm wrong?
Hello,

Thanks for thoroughly examining the code :)

Your understanding is correct. To simplify the implementation, we chose to create a fixed number of capsules up front instead of adding them dynamically. We don't expect this to have a significant impact.
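As a minimal sketch of this design (the class name and dimensions below are hypothetical, not the repository's actual code), pre-allocating one capsule per task looks like:

```python
import torch.nn as nn

class TaskCapsules(nn.Module):
    """Sketch: allocate one TK-layer capsule per task up front."""
    def __init__(self, n_tasks, in_dim=768, cap_dim=16):
        super().__init__()
        # All capsules exist from the start; capsules belonging to
        # future tasks are simply masked out during routing.
        self.capsules = nn.ModuleList(
            nn.Linear(in_dim, cap_dim) for _ in range(n_tasks)
        )
```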
As for the decision maker's tensor, please note that self.tsv is the lower-triangular portion of an all-ones matrix, designed to mask out future capsules (see https://github.com/ZixuanKe/PyContinual/blob/main/src/networks/base/adapters.py#L259 and https://github.com/ZixuanKe/PyContinual/blob/main/src/networks/base/adapters.py#L318).
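For intuition, here is a minimal self-contained sketch of how such a lower-triangular mask behaves (the shapes and variable names are illustrative, not the repository's exact code):

```python
import torch

n_tasks = 6
# Lower-triangular all-ones matrix: row t is 1 for tasks 0..t, 0 afterwards.
tsv = torch.tril(torch.ones(n_tasks, n_tasks))

print(tsv[0])  # tensor([1., 0., 0., 0., 0., 0.])  task 0 routes only to itself
print(tsv[3])  # tensor([1., 1., 1., 1., 0., 0.])  task 3 routes to tasks 0-3

# During transfer routing, multiplying the routing coefficients for
# task t by tsv[t] zeroes out connections to future tasks' capsules.
```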
Thanks again for bringing this to our attention.

That clarifies it! Thanks :)