What does this PR do?

Adds support for float values of val_check_interval in SFT. Also adds support for float and int values of limit_train_batches in SFT and DPO, matching their usage in PyTorch Lightning (PTL).

This was requested by @Kipok

Changelog

Please update the CHANGELOG.md under next version with high level changes in this PR.

Usage

You can potentially add a usage example below

val_check_interval = 0.25    # run validation 4 times per epoch
val_check_interval = 100     # run validation every 100 training steps
limit_train_batches = 0.5    # use only 50% of the training batches per epoch
limit_train_batches = 100    # use only 100 batches of the train dataloader per epoch

All of these options can be used with SFT, DPO, and SPIN.
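
To make the float/int distinction in the settings above concrete, here is a minimal sketch of how such values could be resolved into batch counts, assuming PyTorch Lightning-style semantics (a float is a fraction of an epoch, an int is an absolute count). The helper names `resolve_limit_batches` and `resolve_val_check_interval` are hypothetical and are not part of NeMo-Aligner or PTL; the actual implementation in this PR may differ.

```python
from typing import Union


def resolve_limit_batches(limit: Union[int, float], num_batches: int) -> int:
    """Hypothetical helper: number of train batches consumed per epoch."""
    if isinstance(limit, float):
        # A float is treated as a fraction of the dataloader length.
        if not 0.0 <= limit <= 1.0:
            raise ValueError("a float limit must be in [0.0, 1.0]")
        return int(num_batches * limit)
    # An int is treated as an absolute batch count, capped at the dataloader length.
    if limit < 0:
        raise ValueError("an int limit must be non-negative")
    return min(limit, num_batches)


def resolve_val_check_interval(interval: Union[int, float], batches_per_epoch: int) -> int:
    """Hypothetical helper: number of training steps between validation runs."""
    if isinstance(interval, float):
        # A float is treated as a fraction of one (possibly limited) epoch.
        if not 0.0 < interval <= 1.0:
            raise ValueError("a float interval must be in (0.0, 1.0]")
        return max(1, int(batches_per_epoch * interval))
    # An int is treated as an absolute number of training steps.
    if interval < 1:
        raise ValueError("an int interval must be at least 1")
    return interval


# Assumed example: a dataloader with 1000 batches.
train_batches = resolve_limit_batches(0.5, 1000)               # half of the data per epoch
val_every = resolve_val_check_interval(0.25, train_batches)    # 4 validation runs per epoch
print(train_batches, val_every)
```

Under these assumptions, limit_train_batches = 0.5 on a 1000-batch dataloader leaves 500 batches per epoch, and val_check_interval = 0.25 then triggers validation every 125 steps.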

Before your PR is "Ready for review"

Pre checks:

[X] Make sure you read and followed Contributor guidelines
[ ] Did you write any new necessary tests?
[ ] Did you add or update any necessary documentation? Make sure to also update the NeMo Framework User Guide which contains the tutorials

Checklist when contributing a new algorithm

[ ] Does the trainer resume and restore the model state and all other states?
[ ] Does the trainer support all parallelism techniques (PP, TP, DP)?
[ ] Does the trainer support max_steps=-1 and validation?
[ ] Does the trainer only call APIs defined in alignable_interface.py?
[ ] Does the trainer have proper logging?

Additional Information

Related to # (issue)

NVIDIA / NeMo-Aligner

Added support for float values for val_check_interval to SFT #202