64-bit indexing Adam

Issues:

Incomplete Testing in testLargeTensor Method: Location: tests/L0/run_optimizers/test_adam.py. Description: The method is meant to check the correctness of FusedAdam by calling step() on two optimizers that hold large tensors with the same gradient, the second one being torch.optim.Adam. However, the test only called step() on the first optimizer, so the second tensor was never updated before the comparison.
Type Overflow in TensorListMetadata: Location: csrc/multi_tensor_apply.cuh. Description: The sizes[] and block_to_chunk[] arrays in TensorListMetadata were hard-coded as int. This caused overflow for tensors whose length exceeds INT_MAX.
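
The overflow described in the second issue can be illustrated with a small sketch. This is not apex code: it uses Python's ctypes.c_int32 as a stand-in for a 32-bit int field such as an entry of sizes[], and the helper name store_as_int32 is hypothetical.

```python
import ctypes

INT_MAX = 2**31 - 1  # largest value a 32-bit signed int can hold


def store_as_int32(numel: int) -> int:
    """Return what a 32-bit signed field holds after numel is written to it.

    ctypes does no overflow checking, so out-of-range values wrap around,
    which is how a too-large length typically ends up in a 32-bit int field.
    """
    return ctypes.c_int32(numel).value


small = 1_000_000
large = INT_MAX + 1  # one element more than a 32-bit int can count

print(store_as_int32(small))  # the length is preserved
print(store_as_int32(large))  # the length wraps to a negative number
```

A negative or truncated length stored this way would make any later index arithmetic based on it wrong, which is why the element type has to widen for such tensors.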

Solution:

Added the missing step() call for optimizer2 in testLargeTensor (tests/L0/run_optimizers/test_adam.py), so that the large-tensor test actually updates both optimizers before comparing them.
Refactored TensorListMetadata in csrc/multi_tensor_apply.cuh: the template now accepts either int32_t or int64_t as the size type, while staying backward compatible with the existing declaration method (TensorListMetadata).
Added multi_tensor_apply64 to handle int64_t size indexing for Adam. It mirrors multi_tensor_apply but checks against depth_to_max_tensors64, so that large tensors are supported.
Updated csrc/multi_tensor_adam.cu to call multi_tensor_apply64 and to use the data structure that matches index_t.
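
One way to picture the choice between the 32-bit and 64-bit paths is a dispatch on the largest tensor length. The sketch below is an assumption about how such a selection could look, not the logic in csrc/multi_tensor_adam.cu; the function name select_index_type and the use of INT_MAX as the threshold are hypothetical.

```python
from typing import Iterable

INT_MAX = 2**31 - 1  # largest length a 32-bit signed index can address


def select_index_type(tensor_lengths: Iterable[int]) -> str:
    """Pick the index type name for a list of tensor lengths (illustrative).

    Returns "int64_t" as soon as any length exceeds INT_MAX, otherwise
    "int32_t", so small workloads keep the narrower index type.
    """
    if any(n > INT_MAX for n in tensor_lengths):
        return "int64_t"
    return "int32_t"


print(select_index_type([1024, 4096]))      # all lengths fit in 32 bits
print(select_index_type([1024, 2**31 + 5]))  # one length needs 64 bits
```

Keeping the 32-bit path for tensors that fit would leave existing callers unchanged, which matches the backward-compatibility goal stated above.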

NVIDIA / apex