IntelLabs / distiller

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Apache License 2.0

Structure pruning is broken for models with non-serial connections #85

Closed. nzmora closed this issue 5 years ago.

nzmora commented 5 years ago

Structure pruning is broken for models with non-serial connections. Models such as AlexNet and VGG have serial data-dependencies (connections) and are fine. More complex models with parallel data-dependencies (paths), such as ResNets (skip connections) and GoogLeNet (Inception layers), might fail when pruning filters or channels.

This is because a module, such as a torch.nn.modules.batchnorm.BatchNorm2d layer, might depend on multiple inputs. This is not always a problem: for example, if the dependent module has type torch.nn.Conv2d and we are pruning weight filters, there is no conflict. But if the dependent module has type torch.nn.modules.batchnorm.BatchNorm2d and we are pruning weight filters, then it is possible that each of the inputs selects different activation channels to prune. In such a case, how should we prune the BatchNorm's scale and shift tensors (.weight and .bias)?

To solve this we need to define one of the modules as the leader, which determines which activation channels to prune, and to define the rest of the modules in the dependency sub-graph as followers. Followers do not choose which activation channels to prune; their sparsity masks are determined by the choice of the leader.
Because the sparsity masks of different follower modules may have different shapes, the leader defines a binary map: a binary vector of active (1) and pruned (0) channels. Each "follower" expands this single binary map to create its own private pruning mask.
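
To make the leader/follower idea concrete, here is a minimal sketch (not Distiller's actual implementation; the helper names leader_binary_map and expand_to_mask are made up for this illustration). It assumes the leader is a Conv2d whose filters are ranked by L1-norm, and that followers simply reuse the leader's per-channel binary map:

import torch

def leader_binary_map(conv_weight, fraction_to_prune):
    # Rank the leader's filters by L1-norm and mark the weakest ones as pruned (0).
    num_filters = conv_weight.size(0)
    num_to_prune = int(fraction_to_prune * num_filters)
    l1_norms = conv_weight.view(num_filters, -1).abs().sum(dim=1)
    weakest = l1_norms.sort()[1][:num_to_prune]        # indices of the smallest L1-norms
    binary_map = torch.ones(num_filters)
    binary_map[weakest] = 0
    return binary_map                                  # 1 = keep channel, 0 = prune channel

def expand_to_mask(binary_map, param):
    # A follower expands the leader's per-channel binary map to the shape of its own parameter.
    if param.dim() == 4:    # e.g. a Conv2d weight: mask whole filters (dim 0)
        return binary_map.view(-1, 1, 1, 1).expand_as(param)
    if param.dim() == 1:    # e.g. BatchNorm2d .weight / .bias: one entry per channel
        return binary_map.clone()
    raise ValueError("unexpected parameter shape")

# Leader: a conv with 16 filters; followers: another conv and a BatchNorm2d on the same 16 channels.
leader_w = torch.randn(16, 8, 3, 3)
follower_w = torch.randn(16, 8, 3, 3)
bn_scale = torch.randn(16)

bmap = leader_binary_map(leader_w, fraction_to_prune=0.5)
leader_mask = expand_to_mask(bmap, leader_w)      # shape (16, 8, 3, 3)
follower_mask = expand_to_mask(bmap, follower_w)  # same channels pruned as the leader
bn_mask = expand_to_mask(bmap, bn_scale)          # shape (16,)

The point is that every follower derives its mask from the same binary map, so all modules in the dependency sub-graph agree on which channels are pruned.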

This requires changing the way we express filter/channel pruning in YAML, and how we create pruning masks.

I'm trying to make this fix available soon. This is related to issues #79 and #73.

nzmora commented 5 years ago

@cattpku @vinutah @chenys1995 @promach I merged to 'master' a PR with the fix for issue #85, which should also resolve #79, #86, #73, and #36.

There are changes to the YAML API for structured pruning, with examples. I also added some explanations about the API here: https://github.com/NervanaSystems/distiller/wiki/Pruning-Filters-&-Channels
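
For illustration only, a schedule entry of the kind used in the updated examples might look roughly like the sketch below. The attribute names (in particular group_dependency) are recalled from the example schedules and the wiki page above, so treat this as an approximation and consult those sources for the authoritative syntax:

pruners:
  filters_pruner:
    class: L1RankedStructureParameterPruner_AGP
    group_type: Filters
    group_dependency: Leader
    initial_sparsity: 0.10
    final_sparsity: 0.50
    weights: [module.layer1.0.conv1.weight,
              module.layer1.0.conv2.weight]

With a Leader group-dependency, one tensor in the group (the first one in the weights list, if I recall correctly) determines the binary map and the remaining tensors follow it, as described in the issue above.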

buttercutter commented 5 years ago

@nzmora I still face a similar issue after doing a git pull. Did I miss any steps needed to pick up your fix?

[phung@archlinux distiller]$ git pull
remote: Enumerating objects: 39, done.
remote: Counting objects: 100% (39/39), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 41 (delta 29), reused 34 (delta 27), pack-reused 2
Unpacking objects: 100% (41/41), done.
From https://github.com/NervanaSystems/distiller
   ea173cb..a0bf2a8  master     -> origin/master
   6940eb4..7565638  issue_85   -> origin/issue_85
Updating ea173cb..a0bf2a8
Fast-forward
 distiller/pruning/automated_gradual_pruner.py | 127 +++++++++++++++++++++++++++-----------------------
 distiller/pruning/ranked_structures_pruner.py | 305 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------
 distiller/sensitivity.py | 15 ++++--
 distiller/thinning.py | 90 +++++++++++++++++++++++-------------
 distiller/thresholding.py | 74 +++++++++++++++++++++++++----
 distiller/utils.py | 8 +++-
 examples/agp-pruning/resnet20_filters.schedule_agp.yaml | 35 ++++----------
 examples/agp-pruning/resnet20_filters.schedule_agp_2.yaml | 26 ++++-------
 examples/agp-pruning/resnet20_filters.schedule_agp_3.yaml | 26 ++++-------
 examples/agp-pruning/resnet20_filters.schedule_agp_4.yaml | 154 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 examples/classifier_compression/logging.conf | 12 ++---
 examples/network_trimming/resnet56_cifar_activation_apoz.yaml | 71 ++++++++++++----------------
 examples/network_trimming/resnet56_cifar_activation_apoz_v2.yaml | 55 +++++++++++-----------
 examples/pruning_filters_for_efficient_convnets/resnet56_cifar_channel_rank.yaml | 189 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 examples/pruning_filters_for_efficient_convnets/resnet56_cifar_filter_rank.yaml | 114 ++++++++++++++++++++++----------------------
 examples/pruning_filters_for_efficient_convnets/resnet56_cifar_filter_rank_v2.yaml | 114 ++++++++++++++++++++++----------------------
 examples/pruning_filters_for_efficient_convnets/vgg19.schedule_filter_rank.yaml | 39 ++++++++--------
 imgs/pruning_structs_ex1.png | Bin 0 -> 60148 bytes
 imgs/pruning_structs_ex2.png | Bin 0 -> 80216 bytes
 imgs/pruning_structs_ex3.png | Bin 0 -> 107870 bytes
 imgs/pruning_structs_ex4.png | Bin 0 -> 114971 bytes
 imgs/pruning_structs_ex5.png | Bin 0 -> 111372 bytes
 tests/test_pruning.py | 12 +++--
 tests/test_ranking.py | 11 +++--
 24 files changed, 971 insertions(+), 506 deletions(-)
 create mode 100755 examples/agp-pruning/resnet20_filters.schedule_agp_4.yaml
 create mode 100755 examples/pruning_filters_for_efficient_convnets/resnet56_cifar_channel_rank.yaml
 create mode 100755 imgs/pruning_structs_ex1.png
 create mode 100755 imgs/pruning_structs_ex2.png
 create mode 100755 imgs/pruning_structs_ex3.png
 create mode 100755 imgs/pruning_structs_ex4.png
 create mode 100755 imgs/pruning_structs_ex5.png
[phung@archlinux distiller]$ cd examples/classifier_compression/
[phung@archlinux classifier_compression]$ ls
compress_classifier.py  __init__.py  inspect_ckpt.py  logging.conf  logs  weights.csv
[phung@archlinux classifier_compression]$ time python compress_classifier.py -a=resnet56_cifar -p=50 ../../../data.cifar10 --epochs=70 --lr=0.1 --compress=../pruning_filters_for_efficient_convnets/resnet56_cifar_filter_rank_v2.yaml --resume=../pruning_filters_for_efficient_convnets/checkpoints/checkpoint.resnet56_cifar_baseline.pth.tar -j=1 --deterministic
Log file for this run: /home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/examples/classifier_compression/logs/2018.12.02-092329/2018.12.02-092329.log
==> using cifar10 dataset
=> creating resnet56_cifar model for CIFAR10


Logging to TensorBoard - remember to execute the server:

tensorboard --logdir='./logs'

=> loading checkpoint ../pruning_filters_for_efficient_convnets/checkpoints/checkpoint.resnet56_cifar_baseline.pth.tar
Checkpoint keys: compression_sched best_top1 optimizer state_dict epoch arch
best top@1: 92.920
Loaded compression schedule from checkpoint (epoch 179)
=> loaded checkpoint '../pruning_filters_for_efficient_convnets/checkpoints/checkpoint.resnet56_cifar_baseline.pth.tar' (epoch 179)
Optimizer Type: <class 'torch.optim.sgd.SGD'>
Optimizer Args: {'lr': 0.1, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0.0001, 'nesterov': False}
Files already downloaded and verified
Files already downloaded and verified
Dataset sizes: training=45000 validation=5000 test=10000
Reading compression schedule from: ../pruning_filters_for_efficient_convnets/resnet56_cifar_filter_rank_v2.yaml

L1RankedStructureParameterPruner - param: module.layer1.0.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.0.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.1.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.1.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.2.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.2.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.3.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.3.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.4.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.4.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.5.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.5.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.6.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.6.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.7.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.7.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.8.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.8.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer2.1.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.1.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.2.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.2.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.3.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.3.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.4.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.4.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.6.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.6.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.7.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.7.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer3.2.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.2.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.3.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.3.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.5.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.5.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.6.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.6.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.7.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.7.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.8.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.8.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.1.conv1.weight pruned=(12/64)
L1RankedStructureParameterPruner - param: module.layer3.1.conv1.weight pruned=0.188 goal=0.200
Training epoch: 45000 samples (256 per mini-batch)
==> using cifar10 dataset
=> creating resnet56_cifar model for CIFAR10

Log file for this run: /home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/examples/classifier_compression/logs/2018.12.02-092329/2018.12.02-092329.log
Traceback (most recent call last):
  File "compress_classifier.py", line 789, in <module>
    main()
  File "compress_classifier.py", line 386, in main
    loggers=[tflogger, pylogger], args=args)
  File "compress_classifier.py", line 464, in train
    compression_scheduler.on_minibatch_begin(epoch, train_step, steps_per_epoch, optimizer)
  File "/home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/distiller/scheduler.py", line 120, in on_minibatch_begin
    self.zeros_mask_dict, meta, optimizer)
  File "/home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/distiller/thinning.py", line 416, in on_minibatch_begin
    self.__apply(model, zeros_mask_dict, optimizer)
  File "/home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/distiller/thinning.py", line 407, in __apply
    self.thinning_func(model, zeros_mask_dict, self.arch, self.dataset, optimizer=optimizer)
  File "/home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/distiller/thinning.py", line 223, in remove_filters
    sgraph = create_graph(dataset, arch)
  File "/home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/distiller/thinning.py", line 76, in create_graph
    return SummaryGraph(model, dummy_input.cuda())
  File "/home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/apputils/model_summaries.py", line 102, in __init__
    torch.onnx._optimize_trace(trace, False)
  File "/usr/lib/python3.7/site-packages/torch/onnx/__init__.py", line 42, in _optimize_trace
    trace.set_graph(utils._optimize_graph(trace.graph(), operator_export_type))
  File "/usr/lib/python3.7/site-packages/torch/onnx/utils.py", line 153, in _optimize_graph
    if operator_export_type != OperatorExportTypes.RAW:
TypeError: ne(): incompatible function arguments. The following argument types are supported:

  1. (self: torch._C._onnx.OperatorExportTypes, arg0: torch._C._onnx.OperatorExportTypes) -> bool

Invoked with: OperatorExportTypes.RAW, False

real    0m10.876s
user    0m5.783s
sys     0m1.459s
[phung@archlinux classifier_compression]$ sudo cp /usr/lib/python3.7/site-packages/torch/onnx/utils.py /usr/lib/python3.7/site-packages/torch/onnx/utils_original.py
[sudo] password for phung:
[phung@archlinux classifier_compression]$ sudo nano /usr/lib/python3.7/site-packages/torch/onnx/utils.py
[phung@archlinux classifier_compression]$ time python compress_classifier.py -a=resnet56_cifar -p=50 ../../../data.cifar10 --epochs=70 --lr=0.1 --compress=../pruning_filters_for_efficient_convnets/resnet56_cifar_filter_rank_v2.yaml --resume=../pruning_filters_for_efficient_convnets/checkpoints/checkpoint.resnet56_cifar_baseline.pth.tar -j=1 --deterministic
Log file for this run: /home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/examples/classifier_compression/logs/2018.12.02-092513/2018.12.02-092513.log
==> using cifar10 dataset
=> creating resnet56_cifar model for CIFAR10


Logging to TensorBoard - remember to execute the server:

tensorboard --logdir='./logs'

=> loading checkpoint ../pruning_filters_for_efficient_convnets/checkpoints/checkpoint.resnet56_cifar_baseline.pth.tar
Checkpoint keys: compression_sched best_top1 optimizer state_dict epoch arch
best top@1: 92.920
Loaded compression schedule from checkpoint (epoch 179)
=> loaded checkpoint '../pruning_filters_for_efficient_convnets/checkpoints/checkpoint.resnet56_cifar_baseline.pth.tar' (epoch 179)
Optimizer Type: <class 'torch.optim.sgd.SGD'>
Optimizer Args: {'lr': 0.1, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0.0001, 'nesterov': False}
Files already downloaded and verified
Files already downloaded and verified
Dataset sizes: training=45000 validation=5000 test=10000
Reading compression schedule from: ../pruning_filters_for_efficient_convnets/resnet56_cifar_filter_rank_v2.yaml

L1RankedStructureParameterPruner - param: module.layer1.0.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.0.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.1.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.1.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.2.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.2.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.3.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.3.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.4.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.4.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.5.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.5.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.6.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.6.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.7.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.7.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer1.8.conv1.weight pruned=(11/16)
L1RankedStructureParameterPruner - param: module.layer1.8.conv1.weight pruned=0.688 goal=0.700
L1RankedStructureParameterPruner - param: module.layer2.1.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.1.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.2.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.2.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.3.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.3.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.4.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.4.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.6.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.6.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer2.7.conv1.weight pruned=(19/32)
L1RankedStructureParameterPruner - param: module.layer2.7.conv1.weight pruned=0.594 goal=0.600
L1RankedStructureParameterPruner - param: module.layer3.2.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.2.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.3.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.3.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.5.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.5.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.6.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.6.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.7.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.7.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.8.conv1.weight pruned=(25/64)
L1RankedStructureParameterPruner - param: module.layer3.8.conv1.weight pruned=0.391 goal=0.400
L1RankedStructureParameterPruner - param: module.layer3.1.conv1.weight pruned=(12/64)
L1RankedStructureParameterPruner - param: module.layer3.1.conv1.weight pruned=0.188 goal=0.200
Training epoch: 45000 samples (256 per mini-batch)
==> using cifar10 dataset
=> creating resnet56_cifar model for CIFAR10
Invoking create_thinning_recipe_filters
In tensor module.layer1.0.conv1.weight found 11/16 zero filters
In tensor module.layer1.1.conv1.weight found 11/16 zero filters
In tensor module.layer1.2.conv1.weight found 11/16 zero filters
In tensor module.layer1.3.conv1.weight found 11/16 zero filters
In tensor module.layer1.4.conv1.weight found 11/16 zero filters
In tensor module.layer1.5.conv1.weight found 11/16 zero filters
In tensor module.layer1.6.conv1.weight found 11/16 zero filters
In tensor module.layer1.7.conv1.weight found 11/16 zero filters
In tensor module.layer1.8.conv1.weight found 11/16 zero filters
In tensor module.layer2.1.conv1.weight found 19/32 zero filters
In tensor module.layer2.2.conv1.weight found 19/32 zero filters
In tensor module.layer2.3.conv1.weight found 19/32 zero filters
In tensor module.layer2.4.conv1.weight found 19/32 zero filters
In tensor module.layer2.6.conv1.weight found 19/32 zero filters
In tensor module.layer2.7.conv1.weight found 19/32 zero filters
In tensor module.layer3.1.conv1.weight found 12/64 zero filters
In tensor module.layer3.2.conv1.weight found 25/64 zero filters
In tensor module.layer3.3.conv1.weight found 25/64 zero filters
In tensor module.layer3.5.conv1.weight found 25/64 zero filters
In tensor module.layer3.6.conv1.weight found 25/64 zero filters
In tensor module.layer3.7.conv1.weight found 25/64 zero filters
In tensor module.layer3.8.conv1.weight found 25/64 zero filters
Created, applied and saved a thinning recipe

Log file for this run: /home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/examples/classifier_compression/logs/2018.12.02-092513/2018.12.02-092513.log
Traceback (most recent call last):
  File "compress_classifier.py", line 789, in <module>
    main()
  File "compress_classifier.py", line 386, in main
    loggers=[tflogger, pylogger], args=args)
  File "compress_classifier.py", line 467, in train
    output = model(inputs)
  File "/usr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 124, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/models/cifar10/resnet_cifar.py", line 140, in forward
    x = self.layer1(x)
  File "/usr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/usr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/models/cifar10/resnet_cifar.py", line 71, in forward
    out = self.bn1(out)
  File "/usr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 67, in forward
    exponential_average_factor, self.eps)
  File "/usr/lib/python3.7/site-packages/torch/nn/functional.py", line 1349, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: running_mean should contain 5 elements not 16

real    0m9.714s
user    0m7.129s
sys     0m1.412s
[phung@archlinux classifier_compression]$

chenys1995 commented 5 years ago

I am confused about the "concat1" in pruning_structs_ex4.png; it seems like an element-wise add. In the case of channel-wise concatenation, (1) conv1 -> conv3 and (2) conv2 -> conv3 (with offset index) can be done without any confusion (I guess).

nzmora commented 5 years ago

Hi @promach,

Regarding the problem you see: it appears to be caused by PyTorch 0.4.1, which is not supported by Distiller yet.

Cheers Neta

nzmora commented 5 years ago

Hi @chenys1995,

Regarding your question: I read through the example I gave, and I can see that the diagram shows an element-wise summation while the text refers to a concatenation, which is confusing and wrong. This is now fixed.

BTW, if instead of performing element-wise addition we used concatenation, then we would have an even more complex dependency: from the weights of conv3 we would need to remove the channels corresponding to the filters pruned by both conv1 and conv2. This is not supported yet.
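
To make that bookkeeping concrete, here is a small hypothetical sketch (plain Python, not Distiller code; the helper name is made up) of how the input channels to remove from conv3's weights would be computed if conv3's input were the concatenation of conv1's and conv2's outputs:

def channels_to_remove_after_concat(conv1_pruned_filters, conv2_pruned_filters, conv1_num_filters):
    # conv3's input channels 0..conv1_num_filters-1 come from conv1; the rest come from conv2,
    # so conv2's pruned-filter indices must be shifted by conv1's filter count.
    offset = conv1_num_filters
    return sorted(conv1_pruned_filters + [i + offset for i in conv2_pruned_filters])

# Example: conv1 has 16 filters, conv2 has 32, so conv3 sees 48 input channels.
pruned = channels_to_remove_after_concat([1, 5], [0, 7], conv1_num_filters=16)
print(pruned)  # [1, 5, 16, 23] - input channels that would have to be removed from conv3.weight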

Thanks, Neta

buttercutter commented 5 years ago

Hi @promach,

Regarding the problem you see: it appears to be caused by PyTorch 0.4.1, which is not supported by Distiller yet.

Cheers Neta

@nzmora I am not being pushy, and I am not contributing towards distiller development; I am just a normal user. May I know why distiller still does not support the latest PyTorch? Most systems are now using PyTorch version 1.0.

Please do not take my question as an attack. I just wish to use distiller on my system.

Is there a way to use distiller with PyTorch version 1.0, perhaps with some minor code modifications?

Using virtualenv also does not help:

[phung@archlinux distiller]$ python -m virtualenv env
Using base prefix '/usr'
/usr/lib/python3.7/site-packages/virtualenv.py:1041: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
New python executable in /home/phung/Documents/Grive/Personal/Coursera/Machine_Learning/pruning/distiller/env/bin/python
Installing setuptools, pip, wheel...done.
[phung@archlinux distiller]$ source env/bin/activate
(env) [phung@archlinux distiller]$ pip install -r requirements.txt
Collecting torch==0.4.0 (from -r requirements.txt (line 1))
  Could not find a version that satisfies the requirement torch==0.4.0 (from -r requirements.txt (line 1)) (from versions: 0.1.2, 0.1.2.post1, 0.4.1, 0.4.1.post2, 1.0.0)
No matching distribution found for torch==0.4.0 (from -r requirements.txt (line 1))
(env) [phung@archlinux distiller]$

nzmora commented 5 years ago

Hi @promach,

This is a fair question. We've been planning support for PyTorch 1.0 for some time, but it wasn't high on our priority list. Anyway, I hope to be able to release PyTorch 1.0 support in a few weeks.

I'm sorry you are having issues installing PyTorch 0.4. This version is still supported and using requirements.txt should work for you.
The easiest way to list the available versions of a package is to try installing an illegal version. The error message lists the available versions and you can see 0.4 is supported:

$ pip3 install torch==700
Collecting torch==700
  Could not find a version that satisfies the requirement torch==700 (from versions: 0.1.2, 0.1.2.post1, 0.3.1, 0.4.0, 0.4.1, 1.0.0)
No matching distribution found for torch==700

You can also see it listed in PyPi: https://pypi.org/project/torch/0.4.0/

Please let me know if you figure out why you can't install pytorch 0.4 on your system - that should not be the case. Cheers, Neta

buttercutter commented 5 years ago

Thanks a lot @nzmora for working on PyTorch version 1.0 support.

[phung@archlinux distiller]$ source env/bin/activate
(env) [phung@archlinux distiller]$ pip install torch==0.4.0
Collecting torch==0.4.0
  Could not find a version that satisfies the requirement torch==0.4.0 (from versions: 0.1.2, 0.1.2.post1, 0.4.1, 0.4.1.post2, 1.0.0)
No matching distribution found for torch==0.4.0
(env) [phung@archlinux distiller]$ pip install -r requirements.txt
Collecting torch==0.4.0 (from -r requirements.txt (line 1))
  Could not find a version that satisfies the requirement torch==0.4.0 (from -r requirements.txt (line 1)) (from versions: 0.1.2, 0.1.2.post1, 0.4.1, 0.4.1.post2, 1.0.0)
No matching distribution found for torch==0.4.0 (from -r requirements.txt (line 1))
(env) [phung@archlinux distiller]$

nzmora commented 5 years ago

@promach Note that the link I sent you has a "Download Files" button that gives you a list of PyTorch 0.4 packages to download directly (https://pypi.org/project/torch/0.4.0/#files). Cheers, Neta

buttercutter commented 5 years ago

See the log below for the whl package's incompatibility with my system-wide Python.

Take your time with the development work for PyTorch version 1.0 support. I am only a casual user, and I plan to use distiller once it is ready for PyTorch 1.0 in the near future.

[phung@archlinux distiller]$ source env/bin/activate
(env) [phung@archlinux distiller]$ ls env/
bin  include  lib  pip-selfcheck.json  torch-0.4.0-cp36-cp36m-manylinux1_x86_64.whl
(env) [phung@archlinux distiller]$ pip install env/torch-0.4.0-cp36-cp36m-manylinux1_x86_64.whl
torch-0.4.0-cp36-cp36m-manylinux1_x86_64.whl is not a supported wheel on this platform.
(env) [phung@archlinux distiller]$ python
Python 3.7.2 (default, Dec 29 2018, 21:15:15) [GCC 8.2.1 20181127] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> quit
Use quit() or Ctrl-D (i.e. EOF) to exit
(env) [phung@archlinux distiller]$ python
python python2-config python3.7-config python-argcomplete-check-easy-install-script python2 python2-pyuic4 python3.7m python-argcomplete-tcsh python2.7 python3 python3.7m-config python-config python2.7-config python3.7 python3-config
(env) [phung@archlinux distiller]$ python

Let me try the PyTorch 0.4.0 whl package for Python 2.7, but I am concerned that the distiller code has already moved away from Python 2.