liangyanshuo / InfLoRA

The official implementation of the CVPR'2024 work Interference-Free Low-Rank Adaptation for Continual Learning
MIT License

About the linear space spanned by the input #6

Open linlany opened 1 month ago

linlany commented 1 month ago

Thanks for your great work!

https://github.com/liangyanshuo/InfLoRA/blob/2c774547d48c40fe5bb8c4a393f7b370e1664148/models/vit_inflora.py#L237

I have a question regarding this part of the code. In previous papers, the linear space spanned by the input was computed from randomly sampled inputs. However, in this implementation, the linear space is computed by averaging the product of the input token matrix with its transpose over batches.

Could you please clarify if this method is an approximation of the input linear space? Additionally, are there any references or proofs?

Thank you!

liangyanshuo commented 1 week ago

I am sorry for the late reply.

In a typical linear layer of a Transformer, each token of the input sequence is an individual input to that layer. Therefore, "linear space spanned by the input tokens" would be more precise than "linear space spanned by the input". In fact, you can examine the code of GPM for CNNs (https://github.com/sahagobinda/GPM), where the "linear space spanned by the input" should really be called "linear space spanned by the input patches", since each input is a patch from some image once you unfold a convolution layer into a linear layer.
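To make the connection concrete, here is a minimal sketch (my own illustration, not the repository's code) of the idea being discussed: instead of stacking sampled inputs and taking their SVD as GPM does, one accumulates the token Gram matrix `X @ X.T` over batches and eigendecomposes it. The leading eigenvectors of the accumulated Gram matrix coincide with the leading left singular vectors of the stacked token matrix, so the two views describe the same span; the function name, the `energy` threshold, and the batch layout are assumptions for the example.

```python
import numpy as np

def input_subspace_basis(token_batches, energy=0.95):
    """Approximate the linear space spanned by input tokens.

    token_batches: list of arrays of shape (d, n_tokens), where each
    column is one token (i.e., one individual input to the linear layer).
    Returns an orthonormal basis (d, k) whose columns capture the
    leading `energy` fraction of the tokens' spectral energy.
    """
    d = token_batches[0].shape[0]
    cov = np.zeros((d, d))
    for X in token_batches:
        # Uncentered covariance of this batch's tokens; summing these
        # is equivalent (up to scale) to stacking all tokens and
        # forming [X1 X2 ...] @ [X1 X2 ...].T, so the eigenvectors
        # span the same space the stacked-SVD approach would find.
        cov += X @ X.T
    cov /= len(token_batches)

    # Symmetric eigendecomposition; eigh returns eigenvalues ascending,
    # so flip to descending order.
    vals, vecs = np.linalg.eigh(cov)
    vals, vecs = vals[::-1], vecs[:, ::-1]

    # Keep the smallest k whose cumulative eigenvalue mass reaches `energy`.
    ratio = np.cumsum(vals) / np.sum(vals)
    k = int(np.searchsorted(ratio, energy)) + 1
    return vecs[:, :k]
```

Averaging over batches only rescales the Gram matrix and does not change its eigenvectors, so the recovered span is the same as with a plain sum; the `energy` cutoff is what makes the basis an approximation of the full span.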

I hope this clarifies your question.