neuralmagic / sparsezoo

Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
Apache License 2.0

[V2 Analysis] Feature Branch #393

Closed horheynm closed 12 months ago

horheynm commented 1 year ago

Given a SparseZoo stub or a path to an .onnx file, produce an analysis of each individual node, along with aggregated values, with respect to the weights in the ONNX graph.

Analysis is carried out to obtain the counts (num_weights), counts_sparse, bits, bits_quant for

Statistics (mean, mode, histogram, ...) are computed only for the weights.

Details:

    SparsityAnalysis:       num_counts, num_counts_sparse
    MemoryAccessAnalysis:   num_mem_access, num_mem_access_sparse
    QuantizationAnalysis:   num_bits, num_bits_quant
    ParameterAnalysis:      contains SparsityAnalysis, MemoryAccessAnalysis,
                             QuantizationAnalysis
    OperationAnalysis:      contains SparsityAnalysis, MemoryAccessAnalysis,
                             QuantizationAnalysis
    DistributionAnalysis:   contains SparsityAnalysis, MemoryAccessAnalysis,
                             QuantizationAnalysis
    NodeAnalysis:           contains ParameterAnalysis, OperationAnalysis,
                             DistributionAnalysis per node_id
    SummaryAnalysis:        contains the sum of SparsityAnalysis, MemoryAccessAnalysis,
                             QuantizationAnalysis per grouping
    ModelAnalysis:          contains NodeAnalysis, SummaryAnalysis
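
To make the "sum per grouping" idea behind SummaryAnalysis concrete, here is a minimal sketch. It is not the code in this PR: `SparsityEntry` and `summarize` are hypothetical names, the field names are borrowed from the YAML output, and the node values are made-up toy numbers.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class SparsityEntry:
    """Hypothetical stand-in for one per-node sparsity record."""

    grouping: str        # e.g. "single" or "block4"
    counts: int          # total number of elements
    counts_sparse: int   # number of zero elements

    @property
    def percent(self) -> float:
        # Fraction of sparse elements; 0.0 when there is nothing to count
        return self.counts_sparse / self.counts if self.counts else 0.0


def summarize(entries: Iterable[SparsityEntry]) -> Dict[str, SparsityEntry]:
    """Sum counts and counts_sparse across nodes, one total per grouping."""
    totals: Dict[str, SparsityEntry] = {}
    for entry in entries:
        total = totals.setdefault(
            entry.grouping, SparsityEntry(entry.grouping, 0, 0)
        )
        total.counts += entry.counts
        total.counts_sparse += entry.counts_sparse
    return totals


# Two toy nodes (illustrative numbers only), each with one record per grouping
node_a = [SparsityEntry("single", 100, 60), SparsityEntry("block4", 25, 10)]
node_b = [SparsityEntry("single", 300, 240), SparsityEntry("block4", 75, 50)]

summary = summarize(node_a + node_b)
for grouping, total in summary.items():
    print(grouping, total.counts, total.counts_sparse, total.percent)
```

The same pattern would apply to the quantization records, summing `bits` and `bits_quant` per grouping instead.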

Usage:

sparsezoo.analyze resnet_v1-50-imagenet-pruned95_uniform_quantized --save analysis.yaml

Output analysis.yaml looks like:

nodes:
  '1013':
    graph_order: 22
    input:
    - '982'
    - '983'
    - '984'
    - Conv_188.weight_quantized
    - '993'
    - '994'
    - '1011'
    - '1012'
    - Conv_188.bias_quantized
    mem_access:
      name: Conv_188_quant
      quantization:
      - bits: 26309083136.0
        bits_quant: 26309083136
        grouping: tensor
        percent: 1.0
      sparsity:
      - counts: 3288635392
        counts_sparse: 1889828864
        grouping: single
        percent: 0.5746544200665222
      - counts: 822384640
        counts_sparse: 411443200
        grouping: block4
        percent: 0.5003050640634533
    name: Conv_188_quant
    op_type: QLinearConv
...
summaries:
  mem_access:
    quantization:
      tensor:
        bits: 1915033640960.0
        bits_quant: 1881479176192
        percent: 0.982478394086498
    sparsity:
      block4:
        counts: 63077203968
        counts_sparse: 37744545792
        percent: 0.5983864758994132
      single:
        counts: 236233474048
        counts_sparse: 170480788480
        percent: 0.7216622841746814
  ops:
    quantization:
      block4:
        bits: 14327841216.0
        bits_quant: 14944737592
        percent: 1.0430557797717013
      tensor:
        bits: 15010305592.0
        bits_quant: 14944737592
        percent: 0.9956318011250254
    sparsity:
      block4:
        counts: 1116574288
        counts_sparse: 668309040
        percent: 0.5985352225843122
      single:
        counts: 1103240567
        counts_sparse: 766950808
        percent: 0.6951800277663285
  params:
    quantization:
      tensor:
        bits: 58248704.0
        bits_quant: 25480704
        percent: 0.43744671126073464
    sparsity:
      block4:
        counts: 1085824
        counts_sparse: 528453
        percent: 0.4866838456324414
      single:
        counts: 4209088
        counts_sparse: 2389973
        percent: 0.5678125522678547
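
As a quick sanity check when reading this output, the `percent` fields appear to be plain ratios: `counts_sparse / counts` for sparsity and `bits_quant / bits` for quantization. The sketch below recomputes them from the `summaries.params` values copied from the YAML above. The helper name `recompute_percent` is hypothetical, and the values are hard-coded in a dict rather than parsed from the file so that the snippet has no dependencies.

```python
import math

# Values copied verbatim from the `summaries.params` section of analysis.yaml
params_summary = {
    "quantization": {
        "tensor": {
            "bits": 58248704.0,
            "bits_quant": 25480704,
            "percent": 0.43744671126073464,
        },
    },
    "sparsity": {
        "block4": {
            "counts": 1085824,
            "counts_sparse": 528453,
            "percent": 0.4866838456324414,
        },
        "single": {
            "counts": 4209088,
            "counts_sparse": 2389973,
            "percent": 0.5678125522678547,
        },
    },
}


def recompute_percent(entry: dict) -> float:
    """Recompute the ratio from the raw totals (assumed definition)."""
    if "bits" in entry:
        return entry["bits_quant"] / entry["bits"]
    return entry["counts_sparse"] / entry["counts"]


mismatches = []
for kind, groupings in params_summary.items():
    for grouping, entry in groupings.items():
        ratio = recompute_percent(entry)
        if not math.isclose(ratio, entry["percent"], rel_tol=1e-9):
            mismatches.append((kind, grouping))
        print(f"{kind}/{grouping}: reported={entry['percent']} recomputed={ratio}")

print("mismatches:", mismatches)
```

Under this reading, a quantization `percent` above 1.0, as in the `ops` block4 entry, simply means `bits_quant` exceeds `bits` for that grouping.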

Testing: Ran locally against both a local .onnx path and stubs.