Open SeanNaren opened 2 years ago
We're currently relying on the minGPT/microGPT initialization; however, this might need to be modified, especially considering we're using ZeRO Stage 3.
Some investigation will be required to understand what the initialization should look like.
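
For reference, here is a minimal sketch of what a minGPT-style initialization looks like in plain PyTorch, as far as I understand it: linear and embedding weights drawn from a normal distribution with std 0.02, biases zeroed, and LayerNorm reset to weight 1 and bias 0. The function name `init_weights` and the toy model are only illustrative and are not taken from this repo. The sketch also assumes full, unpartitioned parameters. Under ZeRO Stage 3 the parameters are sharded across ranks, so the same logic would probably need to run while the parameters are gathered (DeepSpeed has `deepspeed.zero.GatheredParameters` for that), which is part of what needs investigating.

```python
import torch
import torch.nn as nn


def init_weights(module: nn.Module) -> None:
    """minGPT-style init (illustrative helper, assumes unpartitioned parameters)."""
    if isinstance(module, (nn.Linear, nn.Embedding)):
        # Small-variance normal init for weight matrices and embedding tables.
        nn.init.normal_(module.weight, mean=0.0, std=0.02)
        if isinstance(module, nn.Linear) and module.bias is not None:
            nn.init.zeros_(module.bias)
    elif isinstance(module, nn.LayerNorm):
        # LayerNorm starts as the identity transform.
        nn.init.zeros_(module.bias)
        nn.init.ones_(module.weight)


# Toy stand-in for the real model, only to show how the hook is applied.
torch.manual_seed(0)
toy = nn.Sequential(nn.Embedding(100, 32), nn.Linear(32, 32), nn.LayerNorm(32))
toy.apply(init_weights)  # nn.Module.apply visits every submodule recursively
```

Whether std 0.02 is still the right choice at our model size is a separate open question from the ZeRO Stage 3 mechanics.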