Closed Ruiqi-Shen closed 3 months ago
Hi there, unfortunately the PR still fails the software check: the mase regression test raises System.IO.IOException: No space left on device, even after two sleepless days of debugging and more than 50 commits.
All the tests in test-machop.sh pass. After that, however, during pytest --log-level=DEBUG --verbose -n 1 --cov=machop/chop/ --cov-report=html machop/test/ --html=report.html --self-contained-html --profile --profile-svg, the program hangs for 5-6 minutes and then fails with an out-of-disk-space error, which prevents the run from proceeding.
Here is the screenshot; you can also view it under Details in the final commit.
We've tried to clean the cache as well as the Docker images on the system, but either we lack permission or the cleanup has no effect.
This issue did not occur previously. We have debugged our program so that all the tests run normally in the mase regression test, which means that once this disk space problem is solved, all the tests should pass without error.
We're also confident that our code runs independently and provides all the functions and methods that we've described. You can verify this in any environment; we recommend Colab and provide a demo, group3.ipynb, which includes a command line based on transform as well as an independent test function, test_group3.py, which integrates all functionalities (pruning, post-prune quantization, training and Huffman coding).
We made some modifications specifically for pruning (i.e. we altered the parameters of the original functions), and we implemented them so that they remain compatible with the other parts of MASE.
Example 1: load_model:
model = load_model(model_short_name, mask, is_quantize, load_name, load_type, model)
where:
model_short_name can be a model such as VGG7 or ResNet18, which we use to test how well pruning generalizes;
mask refers to the weight mask, which we use to add new "mask" keys when loading the state_dict;
is_quantize indicates whether post-prune quantization has been used. If so, the corresponding changes are also needed when loading.
Example 2: PASSES["add_pruning_metadata"]:
graph, sparisity_info, weight_masks, act_masks = PASSES["add_pruning_metadata"](graph, ...)
where:
weight_masks and act_masks refer to the masks generated from the dummy input during weight and activation pruning.
(Note that activation pruning is typically implemented as a dynamic pruning process, where the masks adapt to different inputs. For testing purposes, we record the activation masks generated in their initial state to evaluate the potential drawbacks of static activation pruning.)
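To make the static versus dynamic distinction in the note above concrete, here is a minimal sketch in plain Python. It assumes a simple magnitude criterion, which may not match the methods in pruning_methods.py; the helper magnitude_mask and the sample activations are hypothetical and are not part of MASE.

```python
def magnitude_mask(values, sparsity):
    """Return a 0/1 mask that zeroes the `sparsity` fraction of smallest-magnitude entries.

    Hypothetical helper for illustration only; not the MASE implementation.
    """
    n_prune = int(len(values) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are masked out.
    order = sorted(range(len(values)), key=lambda i: abs(values[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(values))]


dummy_act = [0.9, -0.1, 0.05, 0.7]    # activations produced by the dummy input
other_act = [0.02, 0.8, -0.6, 0.01]   # activations produced by a different input

# Static pruning: the mask is computed once from the dummy input and reused.
static_mask = magnitude_mask(dummy_act, sparsity=0.5)
# Dynamic pruning: the mask is recomputed for every input.
dynamic_mask = magnitude_mask(other_act, sparsity=0.5)

static_pruned = [a * m for a, m in zip(other_act, static_mask)]
dynamic_pruned = [a * m for a, m in zip(other_act, dynamic_mask)]
```

In this toy case the static mask keeps positions 0 and 3, so for the second input it discards the two largest activations, while the dynamic mask keeps them. This is the kind of drawback the recorded initial-state masks are meant to expose.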
There are multiple cases like these, and all of them run correctly within the original MASE system.
By the way, while preparing the PR we also found that the original "train" function has problems when adding metadata (both common and software metadata), and we've fixed it so that it works.
We sincerely hope that our code can help improve the pruning performance of the MASE system.
Thanks very much! Group3
It looks like you pushed lightning_logs in the PR, which takes a lot of space and should not be included in the PR...
Dear Jianyi,
Thanks so much for the reminder!
I've removed the lightning_logs, specifically node_meta_param.
The other parts of the code remain unchanged.
The PR has now passed the software test, which means it passes all 5 tests (5/5), and all the code is working properly.
(By the way, as can be seen in the report, we save the model after each independent pass, including pruning, quantization, training and Huffman coding, so that we can compare metrics such as the reduction in model size. This may take up a lot of space, but it all works now.)
Thanks once again! If you have any questions, please don't hesitate to tell us.
Best Regards, Group 3
From: Jianyi Cheng Sent: 03 April 2024 11:17 To: DeepWok/mase Cc: SHEN, Ruiqi; Author Subject: Re: [DeepWok/mase] Pruning and Training for MASE - Group 3 - Final (PR #99)
It looks like you pushed lightning_logs in the PR, which takes a lot of space and should not be included in the PR...
Pruning and Training for MASE - Group 3
For analysis related to the issues in the software-mase-regression-test, please refer to the first comment, thanks!
Functionality
Basic Elements:
Extensions:
Getting started
How to run
Our code is at https://github.com/Ruiqi-Shen/mase, on the main branch. If possible, please use a machine with more RAM.
Both CPU and GPU work with our program, but we recommend GPU.
Please execute all of our programs in the machop ("mase/machop") directory.
If you need a pre-trained model, please put the pre-trained VGG7 model at mase/test-accu-0.9332.ckpt
Our test function is test_group3.py, inside the existing testing framework; run it from the command line using:
You can also execute the transform function via the command line using
You can change the configuration as you wish.
As there are many configuration options, we keep them in a toml file at configs/example/prune_retrain_group3.toml
Please refer to that file for the default parameter values and edit it to change them.
Additionally, we provide a demo notebook, group3.ipynb, placed in mase/group3.ipynb, which is readily executable on Colab. It includes both the transform function run from the command line and our test function test_group3.py. Please change load_model to a Colab path, such as /content/drive/MyDrive/test-accu-0.9332.ckpt
Example output
Below is a demonstration of an actual output under certain pruning prerequisites:
In summary, the model can maintain or even slightly improve its validation accuracy while undergoing significant compression, which is the desired outcome.
Note: Reducing the actual model size on hardware requires compiler-level modifications. The theoretical strategies are still a significant step, and could yield drastic reductions once the compiler is adjusted. Please refer to the detailed discussion in the report.
Model Storage
Note that we save the model for all passes.
For prune, quantization, train:
If you run the test, find the saved models at:
mase_output/group3_test
Or if you run the transform command, find the saved models at:
mase_output/{project}/software/transforms
For Huffman encoding, find the saved model at:
machop/chop/huffman_model.ckpt
Implementation Overview
Please refer to the Methodology part of the report for detailed illustration and visualization.
Overall Pipeline
Each component within the pipeline is executed through an autonomous pass within transform.py, allowing for the flexible selection and combination of passes to suit specific requirements.
Pruning Methods
Specifically, below are all the pruning methods that we've implemented:
Weight pruning:
Different granularities of weight pruning:
Activation pruning:
Different focus of activation pruning:
Please refer to pruning_methods.py for their specific names.
For a detailed analysis of their principles and performance, as well as the multiple evaluation metrics, please refer to the report.
Training
We use PyTorch Lightning for model training.
The model is constructed with the specified architecture and loaded with pre-pruned weights.
For demonstration, we set epoch=1 and limit_train_batches=1 by default. You can change them.
Post-prune Quantization & Huffman Coding
Additionally, inspired by the methodology from DEEP COMPRESSION: COMPRESSING DEEP NEURAL NETWORKS WITH PRUNING, TRAINED QUANTIZATION AND HUFFMAN CODING, we've implemented post-prune quantization and Huffman coding.
The post-prune quantization converts the 32-bit float data into an 8-bit fixed-point format with 4 bits allocated for the fractional part.
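As a rough illustration of what an 8-bit format with 4 fractional bits implies, here is a minimal sketch in plain Python. It assumes a signed representation with round-to-nearest and saturation, none of which is stated above; the function quantize_fixed_point is hypothetical and is not the MASE quantizer.

```python
def quantize_fixed_point(x, width=8, frac_width=4):
    """Round x to a signed fixed-point value with `width` total bits,
    `frac_width` of which are fractional.

    Assumptions (not confirmed by the PR): signed range, round-to-nearest,
    and saturation on overflow.
    """
    scale = 1 << frac_width              # 2**4 = 16 steps per unit
    q_min = -(1 << (width - 1))          # -128 for 8 bits
    q_max = (1 << (width - 1)) - 1       # 127 for 8 bits
    q = round(x * scale)                 # integer code before clamping
    q = max(q_min, min(q_max, q))        # saturate to the representable range
    return q / scale                     # value the stored code represents


weights = [0.30, -1.27, 9.5, 0.0]
quantized = [quantize_fixed_point(w) for w in weights]
print(quantized)  # [0.3125, -1.25, 7.9375, 0.0]
```

Under these assumptions the step size is 1/16 and the representable range is -8.0 to 7.9375, so a pruned weight of exactly 0.0 stays 0.0 while large values saturate.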
Huffman encoding takes advantage of the newly quantized data: it uses variable-length codes that assign shorter bit strings to more common weight values, compressing the model further.
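The idea can be sketched with the standard library alone. This is a generic textbook Huffman construction, not the code behind huffman_model.ckpt; the function huffman_code and the toy weight list are hypothetical, chosen to mimic a pruned and quantized layer where zeros dominate.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries are (count, tie_breaker, {symbol: partial_code}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n0, _, left = heapq.heappop(heap)    # two least frequent subtrees
        n1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n0 + n1, next_id, merged))
        next_id += 1
    return heap[0][2]


# Toy quantized weights: mostly zeros after pruning.
weights = [0.0] * 6 + [0.25] * 2 + [-0.5] + [1.0]
code = huffman_code(weights)
encoded_bits = sum(len(code[w]) for w in weights)
fixed_bits = 8 * len(weights)
print(encoded_bits, fixed_bits)  # 16 80
```

In this toy case the most common value, 0.0, gets a 1-bit code, and the ten weights need 16 bits instead of 80 at a fixed 8 bits each. The code table itself also has to be stored, which the sketch ignores.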
By default, these two further model compression techniques are enabled, but you can disable them by commenting out all passes.quantize entries and setting is_huffman = false
Note that quantization must be enabled for Huffman encoding to be used.
Train from scratch && Transferability to other models and datasets
By default, the model loads the pre-trained VGG7 model for pruning and training.
If desired, you can opt to train from scratch by setting load_name = None.
Moreover, you are free to select different datasets and models. The ResNet18 network and colored-MNIST are fully compatible with our pipeline and yield satisfactory results. To utilize these, please modify the toml configuration as follows:
Contact
Feel free to contact us at ruiqi.shen23@imperial.ac.uk if you encounter any problems.