nod-ai / SHARK-TestSuite

Temporary home of a test suite we are evaluating
Apache License 2.0

Add ability to gather and analyse some model metadata #376

Closed zjgarvey closed 1 month ago

zjgarvey commented 1 month ago

Usage:

Running python run.py <other-args> --get-metadata saves a dictionary containing the model size and op frequencies to the log directory for each test.
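For reference, here is a minimal sketch of how such a metadata dictionary could be gathered. It is not the code from this PR: the function name gather_metadata is hypothetical, the model is duck-typed (anything exposing graph.node entries with an op_type attribute, as an ONNX ModelProto does), and treating "model_size" as the on-disk file size is an assumption.

```python
import json
import os
from collections import Counter
from types import SimpleNamespace


def gather_metadata(model, model_path=None):
    """Return op frequencies and, if a path is given, the file size in bytes.

    Hypothetical helper for illustration; not the implementation in this PR.
    """
    # Count how often each operator type appears in the graph.
    op_frequency = Counter(node.op_type for node in model.graph.node)
    metadata = {"op_frequency": dict(sorted(op_frequency.items()))}
    if model_path is not None:
        # Assumption: "model_size" is the on-disk size of the model file.
        metadata["model_size"] = os.path.getsize(model_path)
    return metadata


# Stand-in for a loaded ONNX model so the sketch runs without onnx installed.
toy_model = SimpleNamespace(
    graph=SimpleNamespace(
        node=[
            SimpleNamespace(op_type=op)
            for op in ["MatMul", "Add", "MatMul", "Softmax"]
        ]
    )
)

toy_metadata = gather_metadata(toy_model)
print(json.dumps(toy_metadata, indent=4))
```

With a real model, one would presumably pass the object returned by onnx.load(path) together with the same path.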

After a run, use python utils/find_duplicate_models.py to save or print a JSON dump of redundant models, i.e. groups of models that share identical metadata.

Options:

Sample:

I saved the tests below to a file called sample.txt.

add_test
model--bart-base-booksum--KamilAin
model--bart-base-cnn--ainize
model--bart-base-few-shot-k-1024-finetuned-squad-seed-2--anas-awadalla
model--bart-base-few-shot-k-1024-finetuned-squad-seed-4--anas-awadalla

With a clean test-run directory, I ran:

python run.py --testsfile=sample.txt --stages "setup" --get-metadata

The result of running

python utils/find_duplicate_models.py -s

was:

[
    [
        "model--bart-base-booksum--KamilAin",
        "model--bart-base-cnn--ainize"
    ],
    [
        "model--bart-base-few-shot-k-1024-finetuned-squad-seed-4--anas-awadalla",
        "model--bart-base-few-shot-k-1024-finetuned-squad-seed-2--anas-awadalla"
    ]
]

Without the -s arg, the output also includes the shared metadata for each grouping:

[
    {
        "models": [
            "model--bart-base-booksum--KamilAin",
            "model--bart-base-cnn--ainize"
        ],
        "shared_metadata": {
            "model_size": 712772272,
            "op_frequency": {
                "Add": 227,
                "Cast": 13,
                "Concat": 188,
                "Constant": 886,
                "ConstantOfShape": 6,
                "Div": 44,
                "Equal": 5,
                "Erf": 12,
                "Expand": 5,
                "Gather": 64,
                "Less": 1,
                "MatMul": 133,
                "Mul": 99,
                "Pow": 32,
                "Range": 3,
                "ReduceMean": 64,
                "Reshape": 187,
                "Shape": 67,
                "Slice": 2,
                "Softmax": 18,
                "Sqrt": 32,
                "Squeeze": 2,
                "Sub": 35,
                "Transpose": 90,
                "Unsqueeze": 325,
                "Where": 8
            }
        }
    },
    {
        "models": [
            "model--bart-base-few-shot-k-1024-finetuned-squad-seed-4--anas-awadalla",
            "model--bart-base-few-shot-k-1024-finetuned-squad-seed-2--anas-awadalla"
        ],
        "shared_metadata": {
            "model_size": 558176646,
            "op_frequency": {
                "Add": 229,
                "Cast": 17,
                "Concat": 193,
                "Constant": 937,
                "ConstantOfShape": 12,
                "Div": 44,
                "Equal": 10,
                "Erf": 12,
                "Expand": 11,
                "Gather": 70,
                "Less": 1,
                "MatMul": 133,
                "Mul": 103,
                "Pow": 32,
                "Range": 6,
                "ReduceMean": 64,
                "Reshape": 191,
                "ScatterND": 2,
                "Shape": 83,
                "Slice": 7,
                "Softmax": 18,
                "Split": 1,
                "Sqrt": 32,
                "Squeeze": 4,
                "Sub": 35,
                "Transpose": 90,
                "Unsqueeze": 333,
                "Where": 13
            }
        }
    }
]
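
The grouping shown above can be sketched roughly as follows. This is an illustration under assumptions, not the actual contents of utils/find_duplicate_models.py: the function name find_duplicate_groups, its simplified parameter (mirroring -s), and the toy model names are all hypothetical. Models are treated as duplicates when their metadata dictionaries are exactly equal.

```python
import json
from collections import defaultdict


def find_duplicate_groups(metadata_by_model, simplified=False):
    """Group model names whose metadata dictionaries are identical.

    Hypothetical sketch; `simplified=True` mimics the -s output shape.
    """
    groups = defaultdict(list)
    for name, metadata in metadata_by_model.items():
        # Serialize with sorted keys so that key order does not matter.
        key = json.dumps(metadata, sort_keys=True)
        groups[key].append(name)

    result = []
    for key, names in groups.items():
        if len(names) < 2:
            continue  # a model with unique metadata is not redundant
        if simplified:
            result.append(names)
        else:
            result.append({"models": names, "shared_metadata": json.loads(key)})
    return result


# Toy metadata with made-up names and numbers.
sample = {
    "model_a": {"model_size": 100, "op_frequency": {"Add": 2, "MatMul": 1}},
    "model_b": {"model_size": 100, "op_frequency": {"MatMul": 1, "Add": 2}},
    "model_c": {"model_size": 250, "op_frequency": {"Add": 5}},
}

duplicates = find_duplicate_groups(sample, simplified=True)
detailed = find_duplicate_groups(sample)
print(json.dumps(duplicates, indent=4))
```

Singleton groups are dropped, which matches the sample run: add_test has no twin and so does not appear in either output.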

saienduri commented 1 month ago

Thanks