Open eyonland opened 6 months ago
Operations do not put the tensor back on device after a fallback: Merging ttnn operations without the use of run_with_autoformat has hit some roadblocks. The underlying issue is that ops like permute fall back to running on host: they take their tensor on device, pull it onto host for the fallback, and then simply leave it on host. This underlying issue with ops that cannot run entirely on device is currently being masked by tt_lib ops through run_with_autoformat. Getting rid of it fundamentally means we have to go in and at least put the tensors back on device.
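For illustration, here is a minimal sketch of the fallback pattern and the missing final step, assuming the ttnn Python API (`ttnn.from_device`, `ttnn.to_torch`, `ttnn.from_torch`, `ttnn.to_device`); the wrapper function itself is hypothetical, not the actual op implementation:

```python
import torch
import ttnn

def permute_with_host_fallback(tensor: ttnn.Tensor, dims, device):
    # Pull the device tensor onto host for the fallback ...
    host_tensor = ttnn.to_torch(ttnn.from_device(tensor))
    # ... run the op on host via torch ...
    permuted = torch.permute(host_tensor, dims)
    # ... and, crucially, put the result back on device instead of leaving it on host.
    return ttnn.to_device(ttnn.from_torch(permuted), device)
```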
@eyonland what support is provided to handle odd shapes on device?
For example, outer() uses tensors of [1,1,M,1] and [1,1,1,N]. These tensors cannot be handled on device and need to be transformed into [1,1,M,32] and [1,1,32,N]. However, there is no functionality to handle this case properly outside of run_with_autoformat that I know of.
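A hedged sketch of what explicit formatting could look like for this case, padding the size-1 inner dimensions up to the 32-wide tile on host before moving the tensors to device (this assumes M and N are already tile-aligned, and the helper name is made up):

```python
import torch
import ttnn

TILE_WIDTH = 32

def pad_inputs_for_outer(a: torch.Tensor, b: torch.Tensor, device):
    # a: [1, 1, M, 1] -> [1, 1, M, 32]; b: [1, 1, 1, N] -> [1, 1, 32, N].
    # Zero padding is safe here: the extra columns/rows contribute nothing
    # to the matmul that implements outer().
    a_padded = torch.nn.functional.pad(a, (0, TILE_WIDTH - a.shape[-1]))
    b_padded = torch.nn.functional.pad(b, (0, 0, 0, TILE_WIDTH - b.shape[-2]))
    a_tt = ttnn.from_torch(a_padded, layout=ttnn.TILE_LAYOUT, device=device)
    b_tt = ttnn.from_torch(b_padded, layout=ttnn.TILE_LAYOUT, device=device)
    return a_tt, b_tt
```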
Problem: We no longer want to support automatically formatting tensors (i.e., changing their layouts or moving them off/on device implicitly). The user needs to be aware of the cost of these operations, and at the moment that cost is hidden. ttnn offers a to_layout C++ method that can be leveraged to do whatever formatting is necessary. The ask here is to identify any models that currently use the functions below and will break once those functions no longer have this feature. These changes will require the models to import ttnn and use ttnn.to_layout wherever necessary. Subsequently, the following operations need to be updated to no longer use run_with_autoformat and run_without_autoformat.
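As a rough sketch of what affected models would do instead of relying on autoformatting, assuming the Python bindings expose `ttnn.to_layout` and `ttnn.permute` with the signatures used below:

```python
import ttnn

def forward_with_explicit_layout(tensor: ttnn.Tensor):
    # Pay the layout-conversion cost explicitly, up front, instead of having
    # run_with_autoformat hide it inside the op.
    tensor = ttnn.to_layout(tensor, ttnn.TILE_LAYOUT)
    output = ttnn.permute(tensor, (0, 1, 3, 2))
    # Convert back only if a downstream consumer actually needs row-major data.
    return ttnn.to_layout(output, ttnn.ROW_MAJOR_LAYOUT)
```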