ByungKwanLee / TroL

[EMNLP 2024] Official PyTorch implementation of the technical part of Traversal of Layers (TroL), which presents a new propagation operation to achieve strong vision-language performance.
86 stars, 1 fork

The values of the trol_gating weight are the same in each layer. #2

DreamMr closed this issue 5 months ago

DreamMr commented 5 months ago

https://github.com/ByungKwanLee/TroL/blob/7a71dfb9acf1f5339ce29c30f9e2b07faa7757ca/trol/arch_internlm2/modeling_internlm2.py#L875

Hi, I found that the values of the trol_gating weight are the same for every layer in 'trol_gating.pt'. Is that expected? [image]
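For readers who want to reproduce this kind of check, here is a minimal sketch of one way to test whether per-layer parameters hold identical values. It is not taken from the TroL repository: the key names such as `layers.0.trol_gating.weight`, the helper names, and the assumption that the checkpoint is a flat name-to-tensor mapping are all hypothetical. The comparison relies only on `.tolist()`, which `torch.Tensor` provides, so the demo below runs on plain lists.

```python
import re
from collections import defaultdict


def to_nested_list(value):
    # torch.Tensor (and numpy.ndarray) expose .tolist(); plain lists pass through.
    return value.tolist() if hasattr(value, "tolist") else value


def shared_across_layers(state_dict):
    """Map each parameter pattern to True if every layer stores identical values.

    Assumes (hypothetically) that keys contain a numeric layer index between
    dots, e.g. "layers.3.trol_gating.weight".
    """
    groups = defaultdict(list)
    for name, value in state_dict.items():
        # Mask the layer index so the same parameter in different layers groups together.
        pattern = re.sub(r"\.\d+\.", ".*.", name)
        groups[pattern].append(to_nested_list(value))

    result = {}
    for pattern, values in groups.items():
        if len(values) < 2:
            continue  # nothing to compare for parameters that appear only once
        result[pattern] = all(v == values[0] for v in values[1:])
    return result


# Made-up values for illustration only; these are not real TroL weights.
demo = {
    "layers.0.trol_gating.weight": [[0.5, -0.1]],
    "layers.1.trol_gating.weight": [[0.5, -0.1]],
    "layers.0.other.weight": [[1.0]],
    "layers.1.other.weight": [[2.0]],
}
print(shared_across_layers(demo))
```

With a real checkpoint, one would presumably load it first, for example with `torch.load("trol_gating.pt", map_location="cpu")`, and pass the resulting mapping to `shared_across_layers`. Whether the file actually has that flat layout is an assumption to verify before relying on the result.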

ByungKwanLee commented 5 months ago

Originally, I did not intend for the gating weights to be shared across layers, but, as you found, the trained weights were saved in shared form. The released results therefore reflect this shared weight as it is, and it may be updated in a future, improved version. Thank you for examining it!

DreamMr commented 5 months ago

Nice work