metagraph-dev / mlir-graphblas

MLIR tools and dialect for GraphBLAS
https://mlir-graphblas.readthedocs.io/en/latest/
Apache License 2.0

Cannot properly represent i64 sparse tensor via mlir_graphblas.sparse_utils.MLIRSparseTensor #178

Closed · paul-tqh-nguyen closed this issue 3 years ago

paul-tqh-nguyen commented 3 years ago
This script shows that we can't represent a sparse tensor of 64-bit integers via mlir_graphblas.sparse_utils.MLIRSparseTensor:
```python
import numpy as np
from mlir_graphblas.tests.jit_engine_test_utils import *

dense_input_tensor = np.array(
    [0, 1, 2, 0, 0, 3, 0, 0],
    dtype=np.int64,
)
sparse_tensor = sparsify_array(dense_input_tensor, [True])

print()
print(f"sparse_tensor.pointers[0] {repr(sparse_tensor.pointers[0])}")
print(f"sparse_tensor.indices[0] {repr(sparse_tensor.indices[0])}")
print(f"sparse_tensor.values {repr(sparse_tensor.values)}")
print(f"sparse_tensor.get_dimsize(0) {repr(sparse_tensor.get_dimsize(0))}")
print()
```

The result of the script's execution is:

```
(mlirgraphblas) pnguyen@CONDA-0584:/Users/pnguyen/code/mlir-graphblas$ python3 test.py

sparse_tensor.pointers[0] array([0, 3], dtype=uint64)
sparse_tensor.indices[0] array([1, 2, 5], dtype=uint64)
sparse_tensor.values array([ 1, 8589934592, 2])
sparse_tensor.get_dimsize(0) 8

(mlirgraphblas) pnguyen@CONDA-0584:/Users/pnguyen/code/mlir-graphblas$
```
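For reference, here is a minimal sketch of the same construction without the test helpers. The MLIRSparseTensor constructor signature `(indices, values, sizes, sparsity)` is an assumption inferred from the test utilities and should be checked against sparse_utils:

```python
import numpy as np
from mlir_graphblas.sparse_utils import MLIRSparseTensor

# Constructor signature (indices, values, sizes, sparsity) is assumed here;
# verify against mlir_graphblas.sparse_utils before relying on it.
indices = np.array([[1], [2], [5]], dtype=np.uint64)  # positions of the nonzeros
values = np.array([1, 2, 3], dtype=np.int64)          # the i64 payloads
sizes = np.array([8], dtype=np.uint64)                # dense shape (8,)
sparsity = np.array([True], dtype=np.bool_)           # one compressed dimension

st = MLIRSparseTensor(indices, values, sizes, sparsity)
print(st.values)  # expected [1, 2, 3]; instead shows the garbled values above
```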

This makes it difficult to support argmin and argmax in graphblas.reduce_to_vector and graphblas.reduce_to_scalar, since those aggregators need to return tensors of i64 (it appears that sparse tensors with elements of type index are not yet supported in MLIR).
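To make the constraint concrete: argmin/argmax produce index-valued results rather than element-valued ones, so their outputs are inherently 64-bit integers. Plain numpy shown for illustration only:

```python
import numpy as np

m = np.array([[0, 9, 0],
              [4, 0, 7]])

# Row-wise argmax returns positions, not values; numpy uses a 64-bit
# integer dtype for them on 64-bit platforms, and the MLIR lowering
# likewise needs i64 (since sparse tensors of element type `index`
# aren't available).
print(np.argmax(m, axis=1))        # -> [1 2]
print(np.argmax(m, axis=1).dtype)  # -> int64 (on 64-bit platforms)
```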

Another example:
```python
import numpy as np
from mlir_graphblas import MlirJitEngine
from mlir_graphblas.tests.jit_engine_test_utils import *

engine = MlirJitEngine()

mlir_text = """
#SparseVec64 = #sparse_tensor.encoding<{
  dimLevelType = [ "compressed" ],
  pointerBitWidth = 64,
  indexBitWidth = 64
}>

module {
  builtin.func @main(%vector: tensor<?xi64, #SparseVec64>) -> () {
    %vector_values = sparse_tensor.values %vector : tensor<?xi64, #SparseVec64> to memref<?xi64>
    %c0 = constant 0 : index
    %c1 = constant 1 : index
    %c2 = constant 2 : index
    %c1_i64 = constant 1 : i64
    %c2_i64 = constant 2 : i64
    %c3_i64 = constant 3 : i64
    memref.store %c1_i64, %vector_values[%c0] : memref<?xi64>
    memref.store %c2_i64, %vector_values[%c1] : memref<?xi64>
    memref.store %c3_i64, %vector_values[%c2] : memref<?xi64>
    return
  }
}
"""

engine.add(mlir_text, GRAPHBLAS_PASSES)

dense_input_tensor = np.array(
    [7, 8, 9],
    dtype=np.int64,
)
sparse_tensor = sparsify_array(dense_input_tensor, [True])

print()
print("Before:")
print()
print(f"sparse_tensor.pointers[0] {repr(sparse_tensor.pointers[0])}")
print(f"sparse_tensor.indices[0] {repr(sparse_tensor.indices[0])}")
print(f"sparse_tensor.values {repr(sparse_tensor.values)}")
print(f"sparse_tensor.get_dimsize(0) {repr(sparse_tensor.get_dimsize(0))}")
print()

engine.main(sparse_tensor)
dense_ans = densify_vector(sparse_tensor)
np.set_printoptions(linewidth=float("inf"))

print()
print("After:")
print()
print(f"sparse_tensor.pointers[0] {repr(sparse_tensor.pointers[0])}")
print(f"sparse_tensor.indices[0] {repr(sparse_tensor.indices[0])}")
print(f"sparse_tensor.values {repr(sparse_tensor.values)}")
print(f"sparse_tensor.get_dimsize(0) {repr(sparse_tensor.get_dimsize(0))}")
print()
```

Execution:

```
(mlirgraphblas) pnguyen@CONDA-0584:/Users/pnguyen/code/mlir-graphblas$ python3 test.py

Before:

sparse_tensor.pointers[0] array([0, 3], dtype=uint64)
sparse_tensor.indices[0] array([0, 1, 2], dtype=uint64)
sparse_tensor.values array([ 7, 34359738368, 8])
sparse_tensor.get_dimsize(0) 3


After:

sparse_tensor.pointers[0] array([0, 3], dtype=uint64)
sparse_tensor.indices[0] array([0, 1, 2], dtype=uint64)
sparse_tensor.values array([ 1, 8589934592, 2])
sparse_tensor.get_dimsize(0) 3

(mlirgraphblas) pnguyen@CONDA-0584:/Users/pnguyen/code/mlir-graphblas$
```

Note that the weird values aren't arbitrary garbage left in memory: running the above examples produces the same result every time, and both examples show the same pattern. Expected values [a, b, c] come back as [a, b << 32, b]; e.g., 8589934592 is 2 << 32 and 34359738368 is 8 << 32. This looks like a 32-bit/64-bit element-width mismatch rather than uninitialized memory.
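A quick way to see this structure, using only numpy on the values copied from the output above:

```python
import numpy as np

# Values observed in the first example; expected [1, 2, 3].
observed = np.array([1, 8589934592, 2], dtype=np.int64)

# 8589934592 is exactly 2 shifted into the upper 32 bits.
assert observed[1] == (2 << 32)

# Viewing the same buffer as 32-bit words shows where each payload landed,
# consistent with an element-width mismatch rather than random garbage.
print(observed.view(np.uint32))  # -> [1 0 0 2 2 0] on a little-endian machine
```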

CC: @eriknw @jim22k