Seamless Switch Between On-the-Fly and Pre-Computed Logits
Description:
This PR introduces a significant enhancement to the DAMTrainer class, allowing for a seamless switch between on-the-fly and pre-computed logits. This flexibility is particularly beneficial for users with GPUs, enabling faster operations and more efficient testing experiments.
Key Changes:
New Parameters:
generate_logits_on_fly: A boolean parameter to control whether logits should be generated on-the-fly or pre-computed.
use_all_logits: A boolean parameter to indicate if all logits should be used. This is only applicable when generate_logits_on_fly is True.
Assertions:
Added an assertion to ensure that use_all_logits cannot be True if generate_logits_on_fly is False.
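As a standalone illustration, the guard can be sketched as follows (validate_logit_flags is a hypothetical helper name for this sketch, not a function in the trainer):

```python
def validate_logit_flags(generate_logits_on_fly: bool, use_all_logits: bool) -> None:
    """Reject the unsupported flag combination described above.

    use_all_logits is only meaningful when logits are generated on the fly,
    so use_all_logits=True together with generate_logits_on_fly=False fails.
    """
    assert not (use_all_logits and not generate_logits_on_fly), (
        "use_all_logits=True requires generate_logits_on_fly=True"
    )
```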
Logits Computation:
When generate_logits_on_fly is True, logits for each individual model are computed dynamically during training.
When generate_logits_on_fly is False, pre-computed logits are used, and the top-K logits are gathered using the provided indices.
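The pre-computed path can be illustrated with a small NumPy sketch (shapes, variable names, and the use of the stored indices to slice another model's logits are assumptions for illustration, not the trainer's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
batch, seq_len, vocab, k = 2, 4, 10, 3

# Pre-computed path: an individual model's top-K logits and their
# vocabulary indices were stored ahead of time.
stored_logits = rng.normal(size=(batch, seq_len, vocab))
topk_indices = np.argsort(stored_logits, axis=-1)[..., -k:]
topk_values = np.take_along_axis(stored_logits, topk_indices, axis=-1)

# During training, logits computed for the same batch are gathered at
# the same K vocabulary positions before computing the loss, so only
# K values per position are compared rather than the full vocabulary.
training_logits = rng.normal(size=(batch, seq_len, vocab))
training_topk = np.take_along_axis(training_logits, topk_indices, axis=-1)

assert training_topk.shape == topk_values.shape == (batch, seq_len, k)
```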
Efficiency Improvements:
By allowing on-the-fly logits generation, users with GPUs can leverage their hardware to perform operations faster.
This flexibility also aids in conducting various testing experiments more efficiently.
Code Updates:
Updated the compute_loss method to handle both on-the-fly and pre-computed logits.
Modified the compute_individual_logit_losses method to accommodate the new parameters and logic.
Benefits:
Performance: Users with GPUs can experience faster training times by generating logits on-the-fly.
Flexibility: The ability to switch between on-the-fly and pre-computed logits provides greater flexibility for different use cases and testing scenarios.
Simplified Workflow: When using on-the-fly logits generation, there is no need to manage and store top-K logits, simplifying the workflow.
Usage:
To use the new functionality, set the generate_logits_on_fly and use_all_logits parameters when initializing the DAMTrainer.
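A minimal sketch of the call is shown below; the stub class stands in for the real DAMTrainer, whose other constructor arguments (model, data, optimizer settings, and so on) are not shown in this PR:

```python
# Stub stand-in for the real DAMTrainer, for illustration only.
class DAMTrainer:
    def __init__(self, generate_logits_on_fly=False, use_all_logits=False, **kwargs):
        # The two new flags, with the guard described under "Assertions".
        assert not (use_all_logits and not generate_logits_on_fly), (
            "use_all_logits=True requires generate_logits_on_fly=True"
        )
        self.generate_logits_on_fly = generate_logits_on_fly
        self.use_all_logits = use_all_logits

# Generate logits dynamically and use the full distribution:
trainer = DAMTrainer(generate_logits_on_fly=True, use_all_logits=True)
```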
This PR enhances the DAMTrainer class, making it more versatile and efficient for various training and testing scenarios.