syp2ysy / SVF

[NeurIPS 2022] Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning
MIT License

SVD Initialisation #17

Open HichTala opened 4 months ago

HichTala commented 4 months ago

Hi,

First of all, thank you for your work, it's really appreciated.

I have a question about replacing the full-rank modules with their low-rank counterparts in this section: https://github.com/syp2ysy/SVF/blob/8d42e341fcd93658e30e85fc50c86ec04a1a9850/svf.py#L60C1-L63C8

For convolution modules there is no problem, but in this part: https://github.com/syp2ysy/SVF/blob/8d42e341fcd93658e30e85fc50c86ec04a1a9850/svf.py#L117C1-L125C63

it seems to me that for Linear modules the weights of the U, S and V matrices are randomly initialized. My understanding of the paper (I may be wrong) is that the module initialization should be based on an SVD of the pre-trained weights.

Thank you in advance for your clarification.

DUT-CSJ commented 3 months ago

I have the same problem with the Linear module. When applying torch.svd() to the Linear weights, S can come out as NaN. Did you solve this problem?
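
For what it's worth, here is a minimal sketch of one workaround to try (purely illustrative, not the repository's code; it assumes the NaNs come from running the deprecated torch.svd on float32 weights, and switching to torch.linalg.svd in float64 is not guaranteed to fix every case):

```python
import torch
import torch.nn as nn

linear = nn.Linear(512, 256)      # stand-in for a pre-trained Linear module
weight = linear.weight.data       # shape (out_features, in_features)

# Run the decomposition in float64 on CPU, then cast the factors back
# to the original dtype/device.
U, S, Vh = torch.linalg.svd(weight.double().cpu(), full_matrices=False)
U, S, Vh = U.to(weight), S.to(weight), Vh.to(weight)

# Sanity check: the singular values should all be finite.
assert torch.isfinite(S).all(), "SVD produced non-finite singular values"
```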

DUT-CSJ commented 3 months ago

I seem to have solved this problem by fixing the code.

HichTala commented 3 months ago

Hi, I solved the problem too, by initializing the U, S, V matrices from an SVD of the pre-trained weights instead of the random initialization in the code.
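
For anyone hitting the same issue, here is a minimal sketch of what such an SVD-based initialization could look like for a Linear module (the SVDLinear class below is illustrative only, not the repository's implementation; it follows the SVF idea of keeping the singular vectors frozen and fine-tuning only the singular values):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVDLinear(nn.Module):
    """Illustrative SVF-style Linear layer: the pre-trained weight is
    factorized as W = U @ diag(S) @ Vh; U and Vh are frozen and only
    the singular values S are fine-tuned."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        # Initialize U, S, Vh from the SVD of the pre-trained weight,
        # instead of randomly.
        U, S, Vh = torch.linalg.svd(linear.weight.data, full_matrices=False)
        self.U = nn.Parameter(U, requires_grad=False)    # frozen
        self.S = nn.Parameter(S, requires_grad=True)     # trainable
        self.Vh = nn.Parameter(Vh, requires_grad=False)  # frozen
        self.bias = linear.bias

    def forward(self, x):
        # Reassemble the weight from the frozen singular vectors and the
        # trainable singular values.
        weight = self.U @ torch.diag(self.S) @ self.Vh
        return F.linear(x, weight, self.bias)

# Usage: wrap an existing pre-trained Linear module.
pretrained = nn.Linear(512, 256)
svd_linear = SVDLinear(pretrained)
```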