NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Unable to reshard non-contiguous inputs. #3071

Open wujingyue opened 5 hours ago

wujingyue commented 5 hours ago

I found this while writing a test for something else. Most existing tests exercise contiguous input tensors, which is probably why this issue hasn't been caught so far.

Repro:

Apply https://github.com/NVIDIA/Fuser/pull/3070.

$ _bn && mpirun -np 2 bin/test_multidevice --gtest_filter=MultiDeviceTest.NonContiguous

-np 1 also reproduces the failure, so you can run this on a single-GPU workstation.

wujingyue commented 5 hours ago

This appears to be a limitation in make_resharding_contiguous or insert_reshardings. One possible fix is to decompose the set into two: a non-resharding set that makes the tensor contiguous, followed by a resharding set that performs the all-gather.
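
To illustrate the two-step decomposition, here is a toy sketch in plain Python. It does not use any nvFuser API: make_contiguous and all_gather are hypothetical helpers, the "ranks" are just list entries, and the 2x3 transposed view is an assumed example, not the tensor from the failing test.

```python
def make_contiguous(storage, shape, strides):
    """Step 1 (local, non-resharding): read a 2-D strided view and
    return its elements in row-major order."""
    rows, cols = shape
    row_stride, col_stride = strides
    return [
        storage[r * row_stride + c * col_stride]
        for r in range(rows)
        for c in range(cols)
    ]


def all_gather(local_buffers):
    """Step 2 (resharding): toy all-gather in which every rank receives
    the concatenation of all ranks' contiguous buffers."""
    gathered = [x for buf in local_buffers for x in buf]
    return [list(gathered) for _ in local_buffers]


# Assumed example: two ranks, each holding a 2x3 shard that is a
# transposed (non-contiguous) view over 3x2 row-major storage.
rank_storage = [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
shape, strides = (2, 3), (1, 2)

# Make each shard contiguous first, then gather.
contiguous = [make_contiguous(s, shape, strides) for s in rank_storage]
result = all_gather(contiguous)
```

Gathering the raw storage directly would concatenate the elements in storage order rather than in the logical order of the view, which is why the contiguous copy comes first.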