Open dlwh opened 5 months ago
I'm not sure we should do zeros_like (the sharding of the input isn't accessible in general, so we can either do nothing or auto_shard), but it seems like zeros, etc. should shard per the axis mapping.
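For concreteness, here is a rough sketch of what "zeros sharded per the axis mapping" could look like. It uses plain JAX rather than the library's own API, and the helper name `sharded_zeros`, the dict-based `axes` argument, and the logical-to-mesh `axis_mapping` dict are all hypothetical stand-ins chosen for illustration.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec


def sharded_zeros(axes, axis_mapping, mesh, dtype=jnp.float32):
    """Create a zeros array whose sharding follows an axis mapping.

    axes: dict of logical axis name -> size (hypothetical stand-in for named axes).
    axis_mapping: dict of logical axis name -> mesh axis name (assumed format).
    """
    shape = tuple(axes.values())
    # Logical axes missing from the mapping get None, i.e. they stay unsharded.
    spec = PartitionSpec(*(axis_mapping.get(name) for name in axes))
    sharding = NamedSharding(mesh, spec)
    # out_shardings asks jit to produce the result with the requested sharding,
    # instead of building the array on one device and moving it afterwards.
    return jax.jit(lambda: jnp.zeros(shape, dtype), out_shardings=sharding)()


# Use every available device along a single mesh axis called "data".
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("data",))

x = sharded_zeros(
    {"batch": 8 * len(devices), "embed": 16},
    {"batch": "data"},
    mesh,
)
```

Nothing here needs an input array, which is why zeros is the easy case: the requested axes plus the mapping fully determine the sharding. A zeros_like equivalent would have to recover the sharding from its argument, which is the part that isn't accessible in general.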