Training of Q-former last layer and CLORI module

DCDmllm / Cheetah

BSD 3-Clause "New" or "Revised" License

Training of Q-former last layer and CLORI module #14

Open xuliwalker opened 1 year ago

xuliwalker commented 1 year ago

Thanks for sharing such great work! I have a question about your training process: did you train the last linear projection layer of the Q-former and the proposed CLORI module separately or jointly? From the paper, it seems that these two parts are trained separately.