The parameter counts reported in the document appear to be incorrect. The document uses the notation "# Parameters(MB)", which presumably gives the number of parameters with MB as the unit. This is confusing because MB normally means megabytes, whereas a parameter count in millions would use the metric prefix M (10^6). The discrepancy becomes evident when the reported values are compared with the model checkpoints: the medium model, for instance, has a parameter count of 146 million, which deviates significantly from the reported figure. Could you please clarify this discrepancy?
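To make the two readings of "MB" concrete, here is a minimal sketch of the arithmetic. The 146 million figure is the medium checkpoint count mentioned above; the helper names and the bytes-per-parameter values (4 for float32, 2 for float16) are my own assumptions for illustration, not something taken from the document or the checkpoints.

```python
def params_in_millions(num_params: int) -> float:
    """Parameter count expressed in millions (metric prefix M = 10^6)."""
    return num_params / 1_000_000


def params_to_megabytes(num_params: int, bytes_per_param: int = 4) -> float:
    """Storage size in megabytes (10^6 bytes).

    Assumes every parameter is stored at the same width;
    bytes_per_param=4 corresponds to float32 (an assumption).
    """
    return num_params * bytes_per_param / 1_000_000


# Parameter count of the medium checkpoint, as quoted in this post.
medium_params = 146_000_000

print(params_in_millions(medium_params))       # 146.0 (millions of parameters)
print(params_to_megabytes(medium_params))      # 584.0 (MB if stored as float32)
print(params_to_megabytes(medium_params, 2))   # 292.0 (MB if stored as float16)
```

If the column is meant as a count in millions, I would expect a value near the first number; if it is meant as storage size, a value near one of the other two, depending on the data type actually used.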