NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License

[Test][Transformer] Pre-parse container version #1673

Closed Aidyn-A closed 1 year ago

Aidyn-A commented 1 year ago

The test tests/L0/run_transformer/test_pipeline_parallel_fwd_bwd.py fails in PyTorch containers where NVIDIA_PYTORCH_VERSION is not in "YY.MM" format. This PR adds a workaround that preprocesses the version string.
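For illustration, the kind of preprocessing described above could look roughly like the sketch below. This is not the code from this PR: the helper name `parse_container_version`, the regular expression, and the choice to return `None` for unparseable values are all assumptions made for the example. Only the NVIDIA_PYTORCH_VERSION environment variable comes from the description.

```python
import os
import re
from typing import Optional, Tuple

# Hypothetical pattern: the first "YY.MM"-looking substring in the version string.
_YY_MM = re.compile(r"(\d{2})\.(\d{2})")


def parse_container_version(raw: Optional[str]) -> Optional[Tuple[int, int]]:
    """Return (year, month) as ints, or None if no "YY.MM" part is found.

    Hypothetical helper for illustration; not part of apex.
    """
    if not raw:
        # Variable unset or empty: nothing to parse.
        return None
    match = _YY_MM.search(raw)
    if match is None:
        # No "YY.MM" substring at all, e.g. a free-form tag.
        return None
    return int(match.group(1)), int(match.group(2))


# Read the container version; the result is None outside an NVIDIA container.
container_version = parse_container_version(os.environ.get("NVIDIA_PYTORCH_VERSION"))
```

With a helper like this, a test could skip or relax its version check whenever the result is `None` instead of failing while parsing the string.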

cc @crcrpar

Aidyn-A commented 1 year ago

removing the check can be an alternative as well

I am kinda paranoid about backward compatibility, but I guess it is now okay to remove this check completely. Should we remove it only in the test, or in the transformer as well?