tensor-compiler / taco

The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs
http://tensor-compiler.org

Output transformed C++ kernel code from the TACO C++ source code #554

Closed yiansu closed 1 year ago

yiansu commented 1 year ago

Hi, I'm new to taco and very interested in this work. Is there a way to output the kernel code that the taco compiler generates from a source file written with the TACO C++ library?

For instance, with the SpMV example, compiling with the g++ ... command produces the final binary directly, without dumping the intermediate code.

However, the taco command-line tool only generates the kernel for the index expression; it omits the logic for matrix loading, vector initialization, the main function, etc. For example,

taco "y(i) = 42.0 * (A(i,j) * x(j)) + 33.0 * z(i)" -f=y:d -f=A:ds -f=x:d -f=z:d

returns

int compute(taco_tensor_t *y, taco_tensor_t *A, taco_tensor_t *x, taco_tensor_t *z) {
  int y1_dimension = (int)(y->dimensions[0]);
  double* restrict y_vals = (double*)(y->vals);
  int A1_dimension = (int)(A->dimensions[0]);
  int* restrict A2_pos = (int*)(A->indices[1][0]);
  int* restrict A2_crd = (int*)(A->indices[1][1]);
  double* restrict A_vals = (double*)(A->vals);
  int x1_dimension = (int)(x->dimensions[0]);
  double* restrict x_vals = (double*)(x->vals);
  int z1_dimension = (int)(z->dimensions[0]);
  double* restrict z_vals = (double*)(z->vals);

  #pragma omp parallel for schedule(runtime)
  for (int32_t i = 0; i < z1_dimension; i++) {
    double tj_val = 0.0;
    for (int32_t jA = A2_pos[i]; jA < A2_pos[(i + 1)]; jA++) {
      int32_t j = A2_crd[jA];
      tj_val += 42.000000000000000 * (A_vals[jA] * x_vals[j]);
    }
    y_vals[i] = tj_val + 33.000000000000000 * z_vals[i];
  }
  return 0;
}
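
For context, the kernel above only reads a few fields of each tensor: the dimensions, the pos/crd index arrays of A's compressed mode, and the value arrays. The following is a minimal sketch of what a hand-written driver around such a kernel could look like. It is not output from taco: `demo_tensor_t` is a hypothetical, reduced stand-in for taco's tensor struct (the real one has more members), `spmv_demo` is an illustrative name, the 3x3 matrix and vectors are made-up inputs, and the OpenMP pragma is dropped for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for taco's tensor struct: only the fields
   that the kernel above reads. Not the real taco_tensor_t definition. */
typedef struct {
  int32_t   *dimensions; /* size of each mode */
  uint8_t ***indices;    /* indices[mode][0] = pos, indices[mode][1] = crd */
  uint8_t   *vals;       /* values, stored as raw bytes */
} demo_tensor_t;

/* Same loop nest as the generated kernel, without the OpenMP pragma. */
static int compute(demo_tensor_t *y, demo_tensor_t *A,
                   demo_tensor_t *x, demo_tensor_t *z) {
  double* restrict y_vals = (double*)(y->vals);
  int* restrict A2_pos = (int*)(A->indices[1][0]);
  int* restrict A2_crd = (int*)(A->indices[1][1]);
  double* restrict A_vals = (double*)(A->vals);
  double* restrict x_vals = (double*)(x->vals);
  int z1_dimension = (int)(z->dimensions[0]);
  double* restrict z_vals = (double*)(z->vals);

  for (int32_t i = 0; i < z1_dimension; i++) {
    double tj_val = 0.0;
    for (int32_t jA = A2_pos[i]; jA < A2_pos[(i + 1)]; jA++) {
      int32_t j = A2_crd[jA];
      tj_val += 42.0 * (A_vals[jA] * x_vals[j]);
    }
    y_vals[i] = tj_val + 33.0 * z_vals[i];
  }
  return 0;
}

/* Illustrative driver: builds A = [[1,0,2],[0,3,0],[4,0,5]] in CSR form,
   x = [1,1,1], z = [1,2,3], runs the kernel, and writes y into out. */
void spmv_demo(double out[3]) {
  int pos[] = {0, 2, 3, 5};      /* row i owns entries pos[i]..pos[i+1]-1 */
  int crd[] = {0, 2, 1, 0, 2};   /* column index of each stored entry */
  double a_vals[] = {1.0, 2.0, 3.0, 4.0, 5.0};
  double x_vals[] = {1.0, 1.0, 1.0};
  double z_vals[] = {1.0, 2.0, 3.0};
  int32_t mat_dims[] = {3, 3};
  int32_t vec_dims[] = {3};

  uint8_t *a_mode1[] = {(uint8_t*)pos, (uint8_t*)crd};
  uint8_t **a_indices[] = {NULL, a_mode1}; /* mode 0 is dense: no arrays */

  demo_tensor_t A = {mat_dims, a_indices, (uint8_t*)a_vals};
  demo_tensor_t x = {vec_dims, NULL, (uint8_t*)x_vals};
  demo_tensor_t z = {vec_dims, NULL, (uint8_t*)z_vals};
  demo_tensor_t y = {vec_dims, NULL, (uint8_t*)out};

  int rc = compute(&y, &A, &x, &z);
  assert(rc == 0);
  (void)rc; /* silences the unused-variable warning when NDEBUG is set */
}
```

Calling `spmv_demo(out)` with a caller-provided `double out[3]` fills it with y for these inputs. In the library workflow this setup is presumably handled by the TACO C++ library itself, which is why it does not appear in the command-line output.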

Is there an option to explicitly view the full intermediate C++ code generated from the source?

yiansu commented 1 year ago

I've figured this out on my end.