justinchuby / torch-onnx

Prototype of the next torch exporter

Phi3 profiling #74

Open · justinchuby opened this issue 1 week ago

justinchuby commented 1 week ago

Using Fake Tensors

(screenshot attached)
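For context, here is a minimal sketch of the fake-tensor approach, assuming the Hugging Face `transformers` API and an illustrative input shape; it is not the exact script behind the run above. The model is instantiated under `FakeTensorMode`, so the ~3.8B parameters carry shapes and dtypes but no storage.

```python
# Hedged sketch (not the exact script behind this run): instantiate Phi-3 under
# FakeTensorMode so the parameters carry shapes/dtypes but no storage, then
# trace it with torch.export. Input shape and kwargs are illustrative.
import torch
from torch._subclasses.fake_tensor import FakeTensorMode
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

with FakeTensorMode(allow_non_fake_inputs=True):
    # Weights are created directly as FakeTensors; no real memory is allocated.
    model = AutoModelForCausalLM.from_config(config).eval()
    example_input_ids = torch.randint(0, config.vocab_size, (1, 128))
    exported_program = torch.export.export(model, (example_input_ids,))
```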
justinchuby commented 1 week ago

Attached model: model.onnx.zip

justinchuby commented 1 week ago

PyTorch ONNX Conversion Report

✅ Obtain model graph with `torch.export.export`
✅ Translate the graph into ONNX
⚪ Run `onnx.checker` on the ONNX model
⚪ Execute the model with ONNX Runtime
⚪ Validate model output accuracy
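The last three steps are unchecked in this run. A hedged sketch of what they would look like once the exporter has written `model.onnx`; the file name, input name, vocabulary size, and tolerances are assumptions, and a causal LM export may also expect attention-mask or past-key-value inputs:

```python
# Hedged sketch of the three unchecked report steps: onnx.checker, execution
# with ONNX Runtime, and output comparison. Paths, input names, the vocabulary
# size, and tolerances are illustrative assumptions, not taken from this issue.
import numpy as np
import onnx
import onnxruntime as ort

# Run `onnx.checker` on the ONNX model (step 3). Passing a path also works for
# models larger than the 2 GB protobuf limit.
onnx.checker.check_model("model.onnx", full_check=True)

# Execute the model with ONNX Runtime (step 4).
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_ids = np.random.randint(0, 32064, size=(1, 128), dtype=np.int64)  # 32064: assumed vocab size
outputs = session.run(None, {"input_ids": input_ids})

# Validate output accuracy (step 5): compare outputs[0] (logits, by assumption)
# against the eager PyTorch logits for the same input_ids, e.g.
# np.testing.assert_allclose(outputs[0], torch_logits, rtol=1e-3, atol=1e-3)
```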

Profiling result


```
  _     ._   __/__   _ _  _  _ _/_   Recorded: 10:53:38  Samples:  17242
 /_//_/// /_\ / //_// / //_'/ //     Duration: 18.227    CPU time: 18.075
/   _/                      v4.6.2

Program: /Users/justinc/Documents/GitHub/torch-onnx/venv/bin/optimum-cli export onnx --model microsoft/Phi-3-mini-4k-instruct phi3

18.227 export  torch_onnx/_core.py:796
├─ 11.996 export  torch/export/__init__.py:73
│     [273 frames hidden]  torch, contextlib, dis, importlib, as...
└─ 6.228 exported_program_to_ir  torch_onnx/_core.py:618
   ├─ 3.694 wrapper  torch/export/exported_program.py:80
   │     [60 frames hidden]  torch, <string>
   ├─ 1.758 _add_nodes  torch_onnx/_core.py:486
   │  └─ 1.746 _handle_call_function_node_with_lowering  torch_onnx/_core.py:356
   │     └─ 1.179 TracedOnnxFunction.__call__  ../../onnxscript/onnxscript/values.py:581
   │        ├─ 0.612 SymbolicTensor.aten_slice  ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:7524
   │        │  ├─ 0.224 Opset18.Constant  ../../onnxscript/onnxscript/onnx_opset/_impl/opset13.py:408
   │        │  │  └─ 0.219 Op.__call__  ../../onnxscript/onnxscript/values.py:291
   │        │  │     └─ 0.216 OpRecorder.eval  torch_onnx/_building.py:390
   │        │  └─ 0.202 Opset18.Cast  ../../onnxscript/onnxscript/onnx_opset/_impl/opset13.py:241
   │        │     └─ 0.196 Op.__call__  ../../onnxscript/onnxscript/values.py:291
   │        │        └─ 0.192 OpRecorder.eval  torch_onnx/_building.py:390
   │        ├─ 0.258 SymbolicTensor.aten_view  ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:8740
   │        └─ 0.226 SymbolicTensor.aten_clone  ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:1687
   │           └─ 0.226 Opset18.Identity  ../../onnxscript/onnxscript/onnx_opset/_impl/opset16.py:240
   │              └─ 0.222 Op.__call__  ../../onnxscript/onnxscript/values.py:291
   │                 └─ 0.220 OpRecorder.eval  torch_onnx/_building.py:390
   │                    └─ 0.200 OpSignature.from_opschema  torch_onnx/_schemas.py:380
   │                       └─ 0.197 <dictcomp>  torch_onnx/_schemas.py:383
   │                          └─ 0.197 <setcomp>  torch_onnx/_schemas.py:386
   ├─ 0.456 insert_type_promotion_nodes  torch_onnx/_fx_passes.py:13
   │  └─ 0.419 wrapper  torch/onnx/_internal/diagnostics/infra/decorator.py:71
   │        [9 frames hidden]  torch
   └─ 0.302 OnnxRegistry.from_torchlib  torch_onnx/_registration.py:114
```
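The banner above is pyinstrument's (v4.6.2), driven here through optimum-cli. As a generic sketch, not the harness used for this run, an equivalent profile can be captured from Python by wrapping the export call with pyinstrument's API:

```python
# Hedged sketch: capture a pyinstrument profile around torch.export.export,
# producing the same kind of call tree shown above (the real run went through
# optimum-cli rather than this helper).
import torch
from pyinstrument import Profiler

def profiled_export(model: torch.nn.Module, example_args: tuple):
    profiler = Profiler(interval=0.001)  # sampling profiler, ~1 ms interval
    profiler.start()
    try:
        exported = torch.export.export(model, example_args)
    finally:
        profiler.stop()
    print(profiler.output_text(unicode=True, color=False))
    return exported
```

pyinstrument can also wrap an entire script from the command line (`pyinstrument path/to/script.py ...`), which is closer to how the optimum-cli run above was recorded.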
justinchuby commented 1 week ago

Analysis

PyTorch ONNX Conversion Analysis

Model Information

The model has 3821079552 parameters and 1536 buffers (non-trainable parameters). Number of parameters per dtype:

defaultdict(<class 'int'>, {torch.float32: 3821079552})

Number of buffers per dtype:

defaultdict(<class 'int'>, {torch.float32: 1536})

Inputs:

Outputs:

The FX graph has 3697 nodes in total. Number of FX nodes per op:

Of the call_function nodes, the counts of operators used are:

ONNX Conversion Information

All operators in the model have registered ONNX decompositions.
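For reference, the counts reported in this section (parameters and buffers per dtype, FX nodes per op, call_function operator counts) can be reproduced from the eager module and its `ExportedProgram` roughly as follows; this is a sketch, not torch_onnx's own reporting code:

```python
# Hedged sketch (not torch_onnx's reporting code): gather the statistics shown
# above from the eager module and the torch.export.ExportedProgram.
from collections import defaultdict
import torch

def conversion_stats(model: torch.nn.Module, exported: torch.export.ExportedProgram):
    params_per_dtype = defaultdict(int)
    for param in model.parameters():
        params_per_dtype[param.dtype] += param.numel()

    buffers_per_dtype = defaultdict(int)
    for buf in model.buffers():
        buffers_per_dtype[buf.dtype] += buf.numel()

    # FX node counts: total nodes per node.op, plus per-operator counts for
    # call_function nodes (e.g. aten.slice, aten.view).
    nodes_per_op = defaultdict(int)
    call_function_targets = defaultdict(int)
    for node in exported.graph_module.graph.nodes:
        nodes_per_op[node.op] += 1
        if node.op == "call_function":
            call_function_targets[node.target] += 1

    return params_per_dtype, buffers_per_dtype, nodes_per_op, call_function_targets
```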

Profiling result


```
  _     ._   __/__   _ _  _  _ _/_   Recorded: 10:56:07  Samples:  17276
 /_//_/// /_\ / //_// / //_'/ //     Duration: 18.271    CPU time: 18.110
/   _/                      v4.6.2

Program: /Users/justinc/Documents/GitHub/torch-onnx/venv/bin/optimum-cli export onnx --model microsoft/Phi-3-mini-4k-instruct phi3

18.270 export  torch_onnx/_core.py:796
├─ 12.038 export  torch/export/__init__.py:73
│     [277 frames hidden]  torch, contextlib, dis, importlib, as...
└─ 6.231 exported_program_to_ir  torch_onnx/_core.py:618
   ├─ 3.705 wrapper  torch/export/exported_program.py:80
   │     [64 frames hidden]  torch, <string>
   ├─ 1.755 _add_nodes  torch_onnx/_core.py:486
   │  └─ 1.742 _handle_call_function_node_with_lowering  torch_onnx/_core.py:356
   │     ├─ 1.191 TracedOnnxFunction.__call__  ../../onnxscript/onnxscript/values.py:581
   │     │  ├─ 0.625 SymbolicTensor.aten_slice  ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:7524
   │     │  │  ├─ 0.230 Opset18.Cast  ../../onnxscript/onnxscript/onnx_opset/_impl/opset13.py:241
   │     │  │  │  └─ 0.224 Op.__call__  ../../onnxscript/onnxscript/values.py:291
   │     │  │  │     └─ 0.220 OpRecorder.eval  torch_onnx/_building.py:390
   │     │  │  └─ 0.215 Opset18.Constant  ../../onnxscript/onnxscript/onnx_opset/_impl/opset13.py:408
   │     │  │     └─ 0.212 Op.__call__  ../../onnxscript/onnxscript/values.py:291
   │     │  │        └─ 0.211 OpRecorder.eval  torch_onnx/_building.py:390
   │     │  └─ 0.258 SymbolicTensor.aten_view  ../../onnxscript/onnxscript/function_libs/torch_lib/ops/core.py:8740
   │     └─ 0.198 _set_node_metadata  torch_onnx/_core.py:226
   ├─ 0.454 insert_type_promotion_nodes  torch_onnx/_fx_passes.py:13
   │  └─ 0.419 wrapper  torch/onnx/_internal/diagnostics/infra/decorator.py:71
   │        [9 frames hidden]  torch
   └─ 0.299 OnnxRegistry.from_torchlib  torch_onnx/_registration.py:114
```