-
Now that we have tensor products of abelian groups in #2021, a nice exercise (and something useful for calculations) would be to show that
$$\mathbb{Z}/a\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}/b…
-
Hello, could you add a new example of tensor parallel + FSDP that uses a multi-node setup?
Is it possible to do multi-node tensor parallelism with PyTorch 2.3? I am trying to use 2 nodes wi…
-
A couple of issues with the new tensor parallelism implementation!
1) Tensor Parallelism doesn't appear to respect a lack of flash attention, even via the -nfa flag. It also doesn't document flash att…
-
Dear Developers,
I'm a new Allegro user. I'm just trying to run the simple input shown below
```
*************
# general
root: results/water-tutorial
run_name: water
seed: 42
dataset_seed:…
-
`stablehlo.reshape` conversion is failing if we reshape multiple dimensions simultaneously.
An example stablehlo graph
```
module {
func.func @main(%arg0: tensor) -> tensor {
%0 = stablehlo.re…
-
## Context
Suppose I have the following code (essentially [this test file](https://github.com/google/heir/blob/main/tests/Dialect/LinAlg/Conversions/linalg_to_tensor_ext/float_vector_square_matrix_mat…
-
`ttnn.permute` fails with inner-most dim = 1 when doing transpose (tested with 2D tensors, e.g. `(2, 1)` with permutation `(1, 0)`).
This is blocking the `aten.t.default` to `ttnn.permute` conversi…
-
We know that `Transformer_Engine` has support for FP8 training with `data parallel + tensor parallel + sequence parallel`, https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/a…
-
### What happened?
For the given IR
```mlir
#map = affine_map (d0)>
#map1 = affine_map (0, 0, d2)>
#map2 = affine_map (d0, d1, d2)>
#map3 = affine_map (d2)>
#map4 = affine_map (d0, d1, 0)>
#map5 = af…
-
../xfuser/model_executor/layers/attention_processor.py", line 102, in apply_rotary_emb
[rank1]: out = (x.float() * cos + x_rotated.float() * sin).to(x.dtype)
[rank1]: RuntimeError: The size of t…