Modalities / modalities

Modalities, a PyTorch-native framework for distributed and reproducible foundation model training.
MIT License
63 stars 8 forks source link

Replace temporary workaround in MFU computation #238

Open flxst opened 2 months ago

flxst commented 2 months ago

In #215, the function get_theoretical_flops_per_token was created using a temporary workaround to ensure that the computation is conducted only in the presence of GPUs. This is because the underlying function get_total_number_of_trainable_parameters requires GPUs. However, in principle, get_theoretical_flops_per_token depends only on the model architecture.

The if statements could be removed if mocking of the get_total_number_of_trainable_parameters function was used for CPU tests.