tensor-compiler / taco

The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
http://tensor-compiler.org

Can not generate omp parallel code for sparse tensors #561

Open MaxwellF1 opened 2 months ago

MaxwellF1 commented 2 months ago

I'm using the taco C++ API (built with OpenMP ON) to compute the contraction between two sparse tensors using this code:

Format csf({Sparse, Sparse, Sparse, Sparse});
Tensor<double> X = read("x.tns", csf);
Tensor<double> Y = read("y.tns", csf);
Tensor<double> Z({X.getDimension(0),X.getDimension(1),Y.getDimension(0),Y.getDimension(1)}, csf);

IndexVar i, j, k, l, m, n;
Z(i,j,m,n) = X(i,j,k,l) * Y(m,n,k,l); 

Z.compile();
Z.printComputeIR(std::cout);
Z.assemble();
Z.compute();

The printed compute IR contains no OpenMP parallel loop. However, when I tested an SpMV computation, the generated code did include an OpenMP parallel loop.

rohany commented 2 months ago

There won't be any parallel loops generated by default with this setup, as there is no outer dense loop to parallelize over. Perhaps if you used a different format, such as {Dense, Sparse, Sparse, Sparse}, you would see parallel loops generated.
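To illustrate the point about the outer dense loop, here is a hand-written sketch of a CSR SpMV kernel, roughly the kind of loop nest the SpMV case corresponds to. It is not taco's actual generated code: the `CsrMatrix` struct and the `spmv` function are hypothetical names made up for this example, and the `schedule(runtime)` clause is an assumption. The outer loop runs over every row index from 0 to `rows`, so iterations are independent and each one writes a distinct `y[i]`, which is what makes a plain `omp parallel for` safe there.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container (not a taco type). Level 0 (rows) is dense and
// level 1 (columns) is compressed, analogous to Format({Dense, Sparse}).
struct CsrMatrix {
  int rows;
  std::vector<int> pos;      // pos[i]..pos[i+1] delimits row i in crd/vals
  std::vector<int> crd;      // column coordinate of each stored value
  std::vector<double> vals;  // stored nonzero values
};

// Computes y = A * x for a CSR matrix A.
std::vector<double> spmv(const CsrMatrix& A, const std::vector<double>& x) {
  assert(A.pos.size() == static_cast<std::size_t>(A.rows) + 1);
  std::vector<double> y(A.rows, 0.0);

  // Outer loop over a dense level: every row index is visited and each
  // iteration writes only y[i], so the iterations can run in parallel.
  // The pragma is ignored when the file is compiled without OpenMP.
  #pragma omp parallel for schedule(runtime)
  for (int i = 0; i < A.rows; i++) {
    double sum = 0.0;
    // Inner loop over a compressed level: only the stored entries of row i.
    for (int p = A.pos[i]; p < A.pos[i + 1]; p++) {
      sum += A.vals[p] * x[A.crd[p]];
    }
    y[i] = sum;
  }
  return y;
}
```

In the all-Sparse CSF setup from the original post, the outermost loop walks a compressed level instead, which is the case described above as not parallelized by default.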