bigcode-project / Megatron-LM

Ongoing research training transformer models at scale

Want explanation of the MQA related code #80

Closed hyunwoongko closed 1 year ago

hyunwoongko commented 1 year ago

I was confused. I now understand all the details of the implementation. Closing the issue...