One option is to generate a corresponding tensor for each input tensor by shifting one (or more) coordinates of each nonzero. In the case of SuiteSparse matrices, for instance, you can just increment the column coordinate of each nonzero by one. This way, the generated tensor will have similar sparsity to the original tensor while also (likely) having some overlap of nonzeros. You can control the degree of overlap by changing which coordinates are shifted and by how much. You can then make sure that the parameters you choose give a reasonable degree of overlap by inspecting the output of tensor addition (i.e., you'd want the number of nonzeros in the output to be more than the number of nonzeros in the original input but less than twice that).
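Concretely, a sketch of the idea in Python (scipy.sparse stands in for actually loading a SuiteSparse matrix, and wrapping shifted coordinates around at the boundary is one choice, not the only one):

```python
import numpy as np
import scipy.sparse as sp

def shift_columns(A, shift=1):
    """Shift every nonzero's column coordinate by `shift`, wrapping around."""
    A = A.tocoo()
    cols = (A.col + shift) % A.shape[1]  # wraparound at the boundary is an assumption
    return sp.coo_matrix((A.data, (A.row, cols)), shape=A.shape)

A = sp.random(1000, 1000, density=0.01, format="coo")  # stand-in for a real matrix
B = shift_columns(A)

# Overlap check: nnz(A + B) should fall strictly between nnz(A) and 2 * nnz(A).
C = A.tocsr() + B.tocsr()
print(A.nnz, C.nnz, 2 * A.nnz)
```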
That's a great idea, I'll try it out.
I need to check that the overlaps look reasonable.
If I'm unable to get good overlap for the higher-dimensional tensors, I can change the shifting to only shift some of the coordinates (for example, only shift even coordinates).
Shifting even coordinates seems to give decent overlap on the higher-order tensors as well. Shifting all of the coordinates appears to be fine on order-2 tensors, so I'll keep it that way for those. I would have liked to shift every other coordinate instead, but I don't think I can guarantee that Python and taco iterate through tensors in the same order.
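For reference, here is one reading of "shift even coordinates", sketched with plain numpy coordinate arrays (not taco's actual API). Since the decision is based on the coordinate's value rather than its position in iteration order, it doesn't depend on how Python or taco enumerate the nonzeros:

```python
import numpy as np

def shift_even_coords(coords, dims, axis, shift=1):
    """Shift, along `axis`, only the nonzeros whose coordinate there is even,
    wrapping around so coordinates stay in bounds."""
    coords = coords.copy()
    even = coords[:, axis] % 2 == 0
    coords[even, axis] = (coords[even, axis] + shift) % dims[axis]
    return coords

# Example: 1000 nonzero coordinates of a hypothetical order-4 tensor.
dims = (100, 100, 100, 10)
rng = np.random.default_rng(0)
coords = np.stack([rng.integers(0, d, size=1000) for d in dims], axis=1)
shifted = shift_even_coords(coords, dims, axis=1)
```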
We're planning to use FROSTT and maybe suitesparse tensors in the benchmarks for the ufunc operations. Since ufunc's are generally pointwise binary operations, a question we had is what should we use as the "other" tensor in the ufunc operation, if one of the tensors comes from these data sets? @weiya711 and I had a thought to use a uniformly random tensor of the same size as the other operand. 1) I don't know if this is a reasonable thing to do 2) it seems like trying to create uniformly random tensors is pretty expensive (for large tensors). Trying to create a random tensor to match the nips (one of the smaller tensors in the data set) tensor in frost (2482x2862x14036x17) has been going for 10 minutes with no signs of stopping.
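For concreteness, here's a sketch of one way such a random tensor could be sampled without touching the dense coordinate space, by drawing linear indices directly (the target nnz is a made-up parameter here, and collisions are simply deduplicated since nnz is tiny relative to the total size):

```python
import numpy as np

def random_sparse_coords(dims, nnz, seed=0):
    """Sample roughly `nnz` distinct coordinates uniformly from `dims`."""
    rng = np.random.default_rng(seed)
    total = int(np.prod(dims, dtype=np.int64))  # ~1.7e12 for the nips shape
    # Oversample slightly, then deduplicate; collisions are rare when nnz << total.
    lin = rng.integers(0, total, size=int(nnz * 1.05), dtype=np.int64)
    lin = np.unique(lin)[:nnz]
    coords = np.unravel_index(lin, dims)
    vals = rng.random(lin.size)
    return np.stack(coords, axis=1), vals

coords, vals = random_sparse_coords((2482, 2862, 14036, 17), nnz=1_000_000)
```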
Thoughts @stephenchouca ?