microsoft / nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License

assert len(graph.nodes) == len(graph_check.nodes) #5613

Closed. HuYue233 closed this issue 1 year ago.

HuYue233 commented 1 year ago

Describe the bug: I compressed my model with L1NormPruner and then ran speedup on it, and an error occurred. How can I solve this problem?

This is the error: [screenshot 2023-06-19 091727]

I looked into it, but I don't know how to solve the problem: [QQ image 20230619095651]

This is the code for the pruning part of my project:

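import torch

# attempt_load is the checkpoint loader from the ultralytics/yolov3 repository
# (models/experimental.py); this import assumes the script runs from the repository root.
from models.experimental import attempt_load

device = torch.device("cpu")
inputs = torch.randn((1, 3, 768, 768))  # dummy input used for tracing during speedup
model_path = 'weights/yolov3_cqxdq_total_300_300.pt'
pruner_model_path = 'weights/yolov3_cqxdq_pruner_weights.pth'
config_list = [{'sparsity': 0.6, 'op_types': ['Conv2d']}]  # prune 60% of every Conv2d

model = attempt_load(model_path, map_location=device)  # load FP32 model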

from nni.compression.pytorch.pruning import L1NormPruner

pruner = L1NormPruner(model, config_list)
_, masks = pruner.compress()
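for name, mask in masks.items():
    # Note: sum() / numel() is the fraction of weights kept, i.e. 1 - sparsity.
    print(name, ' sparsity : ', '{:.2}'.format(mask['weight'].sum() / mask['weight'].numel()))
# Remove the pruner's wrappers so the original modules are restored before speedup.
pruner._unwrap_model()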

from nni.compression.pytorch.speedup.v2 import ModelSpeedup

m_speedup = ModelSpeedup(model, inputs, masks, device, batch_size=2)
m_speedup.speedup_model()
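
A note on the value printed in the loop above: `mask['weight'].sum() / mask['weight'].numel()` is the fraction of mask entries equal to 1, so it measures how much of the weight is kept. With `'sparsity': 0.6` the printed number would therefore be expected to be close to 0.4, not 0.6. The following is a minimal sketch in plain Python, using a flat 0/1 list as a stand-in for a mask tensor; the helper names `kept_ratio` and `sparsity` are hypothetical and are not part of NNI.

```python
def kept_ratio(mask):
    """Fraction of entries equal to 1 in a flat 0/1 mask (what sum() / numel() computes)."""
    return sum(mask) / len(mask)


def sparsity(mask):
    """Fraction of entries that were pruned away (set to 0)."""
    return 1.0 - kept_ratio(mask)


# Toy mask for 10 output channels: 4 kept, 6 pruned (hypothetical data).
toy_mask = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

print('kept ratio:', kept_ratio(toy_mask))
print('sparsity  :', sparsity(toy_mask))
```
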

This is the structure and forward pass of my model: https://github.com/ultralytics/yolov3

Because I am using a newer version of PyTorch, I made the modification originally posted by @EdwardAndersonMcDermott in https://github.com/ultralytics/yolov5/issues/6948#issuecomment-1075528897: [image]

I also deleted the control flow: [image]

Environment:
NNI version: v3.0rc1
Training service (local|remote|pai|aml|etc): local
Python version: 3.8.5
PyTorch version: 1.11.0
Cpu or cuda version: cpu

Reproduce the problem

J-shang commented 1 year ago

Hello @HuYue233, there are many possible reasons why the nodes of the two graphs do not match: some mismatches are caused by randomness, others by dead nodes. We have updated the logic for cleaning up dead nodes in the latest version, so please install from source and try again to see whether dead nodes are the cause.
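
To illustrate how dead nodes can make two graphs of the same model disagree in node count, here is a toy sketch in plain Python. It does not use NNI or torch.fx and is not NNI's actual cleanup logic: the graph representation (a dict mapping each node name to the names of its inputs) and the function `eliminate_dead_nodes` are hypothetical stand-ins. A node is treated as dead when the output does not depend on it.

```python
def eliminate_dead_nodes(graph, output):
    """Return a copy of `graph` keeping only the nodes that `output` depends on.

    `graph` maps each node name to the list of node names it takes as inputs.
    """
    live = set()
    stack = [output]
    while stack:
        node = stack.pop()
        if node in live:
            continue
        live.add(node)
        stack.extend(graph[node])  # walk backwards through the inputs
    return {name: inputs for name, inputs in graph.items() if name in live}


# Toy graph: 'debug_stat' is computed from 'conv1' but never feeds the output.
graph = {
    'x': [],
    'conv1': ['x'],
    'debug_stat': ['conv1'],
    'conv2': ['conv1'],
    'out': ['conv2'],
}
graph_check = eliminate_dead_nodes(graph, 'out')

# A node-count comparison like the one in the issue title fails while
# only one of the two graphs still contains the dead node...
print(len(graph), len(graph_check), len(graph) == len(graph_check))

# ...and passes once both graphs have been cleaned the same way.
cleaned = eliminate_dead_nodes(graph, 'out')
print(len(cleaned) == len(graph_check))
```
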

Lijiaoa commented 1 year ago

Could you give us some updates? @HuYue233

HuYue233 commented 1 year ago

I commented out line 47 of the code. [screenshot 2023-07-05 093856]