Samsung / ONE

On-device Neural Engine

[compiler] introduce sparsifier (converter for sparsity) #3598

Closed · glistening closed this issue 4 years ago

glistening commented 4 years ago

I would like to introduce a kernel that works on sparse tensors, especially a fully connected kernel with sparse weights. For more, see #3597. I will start from the sparsity format of tflite, which is already part of both our tflite and circle schemas.

If I decide to use tflite's, it will be the CSR format. See this comment from schema.fbs:

// Sparse tensors.
// We use a modification of the TACO format.
// Reference: http://tensor-compiler.org/kjolstad-oopsla17-tensor-compiler.pdf
//
// To encode a conceptual n-dimensional dense tensor with dims (d0, ..., dn-1),
// potentially with a k-dimensional block (0 <= k <= n) with dims
// (dn, ..., dn+k-1), the format needs to specify:
//   1. In what order to traverse these dimensions. For example, to store a 2-D
//      matrix in row major order, the traversal order would be (d0, d1),
//      whereas to store it in column major order, the traversal order would be
//      (d1, d0). If the 2-D matrix has a 2-D inner block, the traversal order
//      could be (d0, d1, d2, d3).
//   2. How each block dimension in (dn, ..., dn+k-1) maps to the original
//      tensor dimension in (d0, ..., dn-1).
//   3. In the traversal order defined above, the format (dense vs. sparse) and
//      index metadata for each dimension. For a dense dimension, this is just
//      the size of that dimension. For a sparse dimension, it's the same as
//      the compressed index defined in the Compressed Sparse Row (CSR) format.
//      (http://scipy-lectures.org/advanced/scipy_sparse/csr_matrix.html)

// The storage type for a dimension. Currently we support:
//   1. DENSE: each coordinate in this dimension is stored implicitly.
//   2. SPARSE_CSR: only the coordinates with non-zero elements are stored. The
//      compression technique is the same what CSR uses.
// More types like a sparse dimension with a different compression technique
// could be added to the list in the future.
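
To make the metadata concrete, here is a small hand-worked sketch in plain numpy. The variable names `segments` and `indices` mirror the `array_segments`/`array_indices` fields of the schema's dimension metadata; nothing below is the tflite API itself, just the CSR bookkeeping it describes.

```python
import numpy as np

# A small dense matrix to encode. Traversal order (d0, d1), i.e. row major.
dense = np.array([
    [1, 0, 0, 2],
    [0, 0, 0, 0],
    [0, 3, 0, 0],
    [4, 0, 5, 0],
], dtype=np.float32)

# d0 is DENSE: its metadata is just the dimension size.
d0_size = dense.shape[0]  # 4

# d1 is SPARSE_CSR: store a segments array (row pointers) and an
# indices array (column indices of non-zero elements), as in scipy CSR.
segments = [0]
indices = []
values = []
for row in dense:
    cols = np.nonzero(row)[0]
    indices.extend(cols.tolist())
    values.extend(row[cols].tolist())
    segments.append(len(indices))

print(segments)  # [0, 2, 2, 3, 5] -> row i owns indices[segments[i]:segments[i+1]]
print(indices)   # [0, 3, 1, 0, 2]
print(values)    # [1.0, 2.0, 3.0, 4.0, 5.0]
```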

Then, it may be simple, at least in python, because it could be done with the existing implementation. (As of tensorflow v2.3, the tensorflow lite converter does not provide sparsification.)

However, it is likely that I will use a similar but somewhat modified version of the format. I will add more information here in a comment after I investigate further.

cc @seanshpark, @Samsung/nncc_committers

mhs4670go commented 4 years ago

@glistening I got a network that seems to have a sparse tensor here (though I'm not sure as of now). Could you let me know how to make this kind of network, i.e. a network that contains a sparse tensor?

glistening commented 4 years ago

@mhs4670go I created the model manually at the time of #3689. For your information, it is a random sparse model.
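
In case it helps, one way to build such a random sparse model is to zero out most of a Dense layer's kernel by hand and then convert. A minimal sketch (the layer sizes and sparsity ratio are arbitrary; note the resulting .tflite still stores the weights densely, without sparsity metadata, until a sparsifier is applied):

```python
import numpy as np
import tensorflow as tf

# Build a tiny FC model and overwrite its kernel with random sparse weights.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, input_shape=(8,), use_bias=True),
])

kernel, bias = model.layers[0].get_weights()
mask = np.random.rand(*kernel.shape) < 0.2   # keep ~20% of the weights
model.layers[0].set_weights([kernel * mask, bias])

# Convert to a .tflite flatbuffer; the weights are sparse in value but
# still laid out densely at this point.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open("random_sparse_fc.tflite", "wb") as f:
    f.write(converter.convert())
```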

While working on my target model's sparsity, I've found it can be represented using block sparsity without introducing our own format in circle. I've also found some tools in recent tensorflow source. I will try to bring in or use some of them.
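
For intuition, a block-sparse representation tiles the weight matrix into fixed-size blocks and stores only the blocks that contain any non-zero value. A minimal numpy sketch (the 4x1 block shape is arbitrary here; the real kernels use shapes like 16x1):

```python
import numpy as np

BLOCK = (4, 1)  # block shape (rows x cols); e.g. 16x1 for neon-friendly kernels

w = np.zeros((8, 4), dtype=np.float32)
w[0:4, 1] = [1, 2, 3, 4]   # one non-zero 4x1 block
w[4:8, 3] = [5, 6, 7, 8]   # another non-zero 4x1 block

kept_blocks = []
for r in range(0, w.shape[0], BLOCK[0]):
    for c in range(0, w.shape[1], BLOCK[1]):
        block = w[r:r + BLOCK[0], c:c + BLOCK[1]]
        if np.any(block):
            kept_blocks.append(((r // BLOCK[0], c // BLOCK[1]), block))

# Only 2 of the 8 blocks survive; the all-zero blocks need no storage at all.
print(len(kept_blocks), "of", (w.shape[0] // BLOCK[0]) * (w.shape[1] // BLOCK[1]))
```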

mhs4670go commented 4 years ago

@glistening I would be glad if you could give me the script you ran to generate the network. It's hard to create a network that contains a sparse tensor manually.

glistening commented 4 years ago

I've succeeded in sparsifying my model using tensorflow's FormatConverter.
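
For anyone reproducing this: FormatConverter lives in the tensorflow lite sources (the format_converter.cc mentioned below). Later tensorflow releases also expose sparsification directly in the converter; a minimal sketch, assuming a version that ships the experimental tf.lite.Optimize.EXPERIMENTAL_SPARSITY flag (not available in the 2.3 converter, as noted above) and a hypothetical saved-model path:

```python
import tensorflow as tf

# "my_pruned_model" is a placeholder path to an already-pruned saved model.
converter = tf.lite.TFLiteConverter.from_saved_model("my_pruned_model")
converter.optimizations = [tf.lite.Optimize.EXPERIMENTAL_SPARSITY]
tflite_model = converter.convert()

with open("sparse_model.tflite", "wb") as f:
    f.write(tflite_model)
```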

I will come back to the sparsifier tool after finishing the onert runtime work (e.g. writing a 16x1 neon kernel, comparing kernel candidates, and finding a way to solve the performance regression from tflite converter 2.2.0 → 2.3.0).

mhs4670go commented 4 years ago

@glistening Could you share the network you generated?

glistening commented 4 years ago

@mhs4670go Sure. I've confirmed the model works. Please see the internal model server.

glistening commented 4 years ago

At first, I was thinking of writing a standalone sparsifier tool that would sparsify both formats, since a sparse tensor can be represented both in tflite and in circle.

However, I've found that all our passes (e.g. bcq, fuse_xxx, ...) are in circle2circle even when they do not depend on circle. There is no such tool as tflite2tflite. Thus, it seems that sparsify has to (or at least would be good to) go into circle2circle in the same way, though some tools (gen_golden, tflite_run, ...) do not work with the .circle format.

mhs4670go commented 4 years ago

@glistening I could implement the sparsifier that will be introduced into circle2circle if you don't mind:)

FYI, the reason all the passes are in circle2circle is that they should be. Even though we refer to tflite a lot, the final goal is to be able to accept models in a variety of NN formats. circle2circle will evolve gradually:)

glistening commented 4 years ago

@mhs4670go

> @glistening I could implement the sparsifier that will be introduced into circle2circle if you don't mind:)

Sure. I appreciate your help. As I wrote, you can bring in format_converter.cc with almost no dependency on the tensorflow lite core by removing SparseToDense, which is not used at the moment.

> FYI, the reason all the passes are in circle2circle is that they should be. Even though we refer to tflite a lot, the final goal is to be able to accept models in a variety of NN formats. circle2circle will evolve gradually:)

circle2circle looks very good. It provides everything I need for a tool, e.g. command-line argument parsing (I like arser; I would like to replace boost-program-options with arser in nnpackage_run to remove the boost dependency) and flatc-related utilities. Also, tensorflow lite already cannot run my target model, since it uses some onert-specific features. In terms of SDK release, it is good as well.

glistening commented 4 years ago

@mhs4670go As I understand it, you've finished the sparsifier. Can we close this issue and #4394?

mhs4670go commented 4 years ago

@glistening Oh, this issue can be closed. But I will close #4394 a bit later because I still need to add some tests for it.