tfoote / nps_me4823

A repository with a dockerfile for testing novnc deployments
Apache License 2.0

Matlab image is 29 GiB #9

Closed M1chaelM closed 3 years ago

M1chaelM commented 3 years ago

Using a multistage build process reduces the Matlab image size from 44 GiB to 29 GiB. It's hard to get smaller than this for a full install of Matlab because /usr/local/MATLAB is 26 GiB all by itself. Here is the breakdown:

4.0K    /usr/local/MATLAB/VersionInfo.xml
106M    /usr/local/MATLAB/appdata
7.4G    /usr/local/MATLAB/bin
144M    /usr/local/MATLAB/cefclient
63M     /usr/local/MATLAB/derived
2.8G    /usr/local/MATLAB/examples
9.4M    /usr/local/MATLAB/extern
4.9G    /usr/local/MATLAB/help
2.2M    /usr/local/MATLAB/interprocess
185M    /usr/local/MATLAB/java
84K     /usr/local/MATLAB/license_agreement.txt
1.2G    /usr/local/MATLAB/mcr
16K     /usr/local/MATLAB/patents.txt
14M     /usr/local/MATLAB/polyspace
148K    /usr/local/MATLAB/remote
242M    /usr/local/MATLAB/resources
13M     /usr/local/MATLAB/rtw
1.1M    /usr/local/MATLAB/runtime
2.1M    /usr/local/MATLAB/simulink
939M    /usr/local/MATLAB/sys
7.9G    /usr/local/MATLAB/toolbox
4.0K    /usr/local/MATLAB/trademarks.txt
132M    /usr/local/MATLAB/ui
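
To see which directories dominate a listing like the one above, the sizes can be parsed and ranked. The sketch below assumes the listing came from something like `du -sh /usr/local/MATLAB/*` (the exact command is not given in the thread) and that the K/M/G suffixes are powers of 1024, as with GNU `du -h`. The function names are made up for illustration. Only the five largest directories are included in the sample; together they account for about 24.2 GiB of the 26 GiB.

```python
# Sketch: parse `du -h`-style lines and rank entries by size.
# Assumes 1024-based suffixes (GNU du -h); names here are illustrative only.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a du -h size such as '7.4G' or '939M' to a byte count."""
    text = text.strip()
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)  # no suffix: already a byte count


def parse_du(listing):
    """Return (path, bytes) pairs from a du listing, largest first."""
    entries = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        size, path = line.split(None, 1)
        entries.append((path.strip(), parse_size(size)))
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


# The five largest directories from the breakdown above.
SAMPLE = """\
7.4G    /usr/local/MATLAB/bin
2.8G    /usr/local/MATLAB/examples
4.9G    /usr/local/MATLAB/help
1.2G    /usr/local/MATLAB/mcr
7.9G    /usr/local/MATLAB/toolbox
"""

if __name__ == "__main__":
    ranked = parse_du(SAMPLE)
    total = sum(size for _, size in ranked)
    for path, size in ranked:
        print(f"{size / 1024**3:6.1f} GiB  {100 * size / total:5.1f}%  {path}")
    print(f"{total / 1024**3:6.1f} GiB  total of the listed entries")
```

Ranked this way, toolbox and bin come out on top, with help and examples next.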

The only way I can think of to reduce space is to avoid a full install, but I'm not sure which modules we do and do not need. I will post the whole list below, and if we want to try to slim down we can pick out specific products that are required.

M1chaelM commented 3 years ago

The full list of products is as follows:

#product.5G_Toolbox
#product.AUTOSAR_Blockset
#product.Aerospace_Blockset
#product.Aerospace_Toolbox
#product.Antenna_Toolbox
#product.Audio_Toolbox
#product.Automated_Driving_Toolbox
#product.Bioinformatics_Toolbox
#product.Communications_Toolbox
#product.Computer_Vision_Toolbox
#product.Control_System_Toolbox
#product.Curve_Fitting_Toolbox
#product.DDS_Blockset
#product.DO_Qualification_Kit
#product.DSP_System_Toolbox
#product.Data_Acquisition_Toolbox
#product.Database_Toolbox
#product.Datafeed_Toolbox
#product.Deep_Learning_HDL_Toolbox
#product.Deep_Learning_Toolbox
#product.Econometrics_Toolbox
#product.Embedded_Coder
#product.Filter_Design_HDL_Coder
#product.Financial_Instruments_Toolbox
#product.Financial_Toolbox
#product.Fixed-Point_Designer
#product.Fuzzy_Logic_Toolbox
#product.GPU_Coder
#product.Global_Optimization_Toolbox
#product.HDL_Coder
#product.HDL_Verifier
#product.IEC_Certification_Kit
#product.Image_Acquisition_Toolbox
#product.Image_Processing_Toolbox
#product.Instrument_Control_Toolbox
#product.LTE_Toolbox
#product.Lidar_Toolbox
#product.MATLAB
#product.MATLAB_Coder
#product.MATLAB_Compiler
#product.MATLAB_Compiler_SDK
#product.MATLAB_Parallel_Server
#product.MATLAB_Production_Server
#product.MATLAB_Report_Generator
#product.MATLAB_Web_App_Server
#product.Mapping_Toolbox
#product.Mixed-Signal_Blockset
#product.Model_Predictive_Control_Toolbox
#product.Model-Based_Calibration_Toolbox
#product.Motor_Control_Blockset
#product.Navigation_Toolbox
#product.OPC_Toolbox
#product.Optimization_Toolbox
#product.Parallel_Computing_Toolbox
#product.Partial_Differential_Equation_Toolbox
#product.Phased_Array_System_Toolbox
#product.Polyspace_Bug_Finder
#product.Polyspace_Bug_Finder_Server
#product.Polyspace_Code_Prover
#product.Polyspace_Code_Prover_Server
#product.Powertrain_Blockset
#product.Predictive_Maintenance_Toolbox
#product.RF_Blockset
#product.RF_Toolbox
#product.ROS_Toolbox
#product.Radar_Toolbox
#product.Reinforcement_Learning_Toolbox
#product.Risk_Management_Toolbox
#product.Robotics_System_Toolbox
#product.Robust_Control_Toolbox
#product.Satellite_Communications_Toolbox
#product.Sensor_Fusion_and_Tracking_Toolbox
#product.SerDes_Toolbox
#product.Signal_Processing_Toolbox
#product.SimBiology
#product.SimEvents
#product.Simscape
#product.Simscape_Driveline
#product.Simscape_Electrical
#product.Simscape_Fluids
#product.Simscape_Multibody
#product.Simulink
#product.Simulink_3D_Animation
#product.Simulink_Check
#product.Simulink_Code_Inspector
#product.Simulink_Coder
#product.Simulink_Compiler
#product.Simulink_Control_Design
#product.Simulink_Coverage
#product.Simulink_Design_Optimization
#product.Simulink_Design_Verifier
#product.Simulink_Desktop_Real-Time
#product.Simulink_PLC_Coder
#product.Simulink_Real-Time
#product.Simulink_Report_Generator
#product.Simulink_Requirements
#product.Simulink_Test
#product.SoC_Blockset
#product.Spreadsheet_Link
#product.Stateflow
#product.Statistics_and_Machine_Learning_Toolbox
#product.Symbolic_Math_Toolbox
#product.System_Composer
#product.System_Identification_Toolbox
#product.Text_Analytics_Toolbox
#product.UAV_Toolbox
#product.Vehicle_Dynamics_Blockset
#product.Vehicle_Network_Toolbox
#product.Vision_HDL_Toolbox
#product.WLAN_Toolbox
#product.Wavelet_Toolbox
#product.Wireless_HDL_Toolbox

M1chaelM commented 3 years ago

Here is a slimmed down list of products Brian typically uses:

-----------------------------------------------------------------------------------------------------
MATLAB Version: 9.7.0.1319299 (R2019b) Update 5
MATLAB License Number: 827417
Operating System: Linux 5.3.0-42-generic #34~18.04.1-Ubuntu SMP Fri Feb 28 13:42:26 UTC 2020 x86_64
Java Version: Java 1.8.0_202-b08 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
-----------------------------------------------------------------------------------------------------
MATLAB Version 9.7 (R2019b)
Simulink Version 10.0 (R2019b)
Aerospace Blockset Version 4.2 (R2019b)
Aerospace Toolbox Version 3.2 (R2019b)
Communications Toolbox Version 7.2 (R2019b)
Computer Vision Toolbox Version 9.1 (R2019b)
Control System Toolbox Version 10.7 (R2019b)
Curve Fitting Toolbox Version 3.5.10 (R2019b)
DSP System Toolbox Version 9.9 (R2019b)
Deep Learning Toolbox Version 13.0 (R2019b)
Embedded Coder Version 7.3 (R2019b)
Image Acquisition Toolbox Version 6.1 (R2019b)
Image Processing Toolbox Version 11.0 (R2019b)
Instrument Control Toolbox Version 4.1 (R2019b)
MATLAB Coder Version 4.3 (R2019b)
MATLAB Compiler Version 7.1 (R2019b)
MATLAB Compiler SDK Version 6.7 (R2019b)
MATLAB Report Generator Version 5.7 (R2019b)
Model Predictive Control Toolbox Version 6.3.1 (R2019b)
Navigation Toolbox Version 1.0 (R2019b)
Optimization Toolbox Version 8.4 (R2019b)
ROS Toolbox Version 1.0 (R2019b)
Reinforcement Learning Toolbox Version 1.1 (R2019b)
Robotics System Toolbox Version 3.0 (R2019b)
Signal Processing Toolbox Version 8.3 (R2019b)
Simulink Code Inspector Version 3.5 (R2019b)
Simulink Coder Version 9.2 (R2019b)
Stateflow Version 10.1 (R2019b)
Statistics and Machine Learning Toolbox Version 11.6 (R2019b)
Symbolic Math Toolbox Version 8.4 (R2019b)
System Identification Toolbox Version 9.11 (R2019b)
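
The product names in this `ver`-style output line up with the installer keys in the full list if spaces are replaced by underscores. That holds for every product listed here, but it is an observation about these two lists rather than a documented rule. A minimal sketch of that mapping, with a made-up function name:

```python
import re

# Matches product lines such as "Aerospace Blockset Version 4.2 (R2019b)".
# The header lines use "Version:" with a colon, so they do not match.
PRODUCT_LINE = re.compile(
    r"^(?P<name>.+?)\s+Version\s+[\d.]+\s+\(R\d{4}[ab]\)\s*$"
)


def product_keys_from_ver(ver_text):
    """Return installer-style names (spaces -> underscores) found in ver output."""
    keys = []
    for line in ver_text.splitlines():
        match = PRODUCT_LINE.match(line.strip())
        if match:
            keys.append(match.group("name").replace(" ", "_"))
    return keys


SAMPLE = """\
MATLAB Version: 9.7.0.1319299 (R2019b) Update 5
MATLAB License Number: 827417
MATLAB Version 9.7 (R2019b)
Simulink Version 10.0 (R2019b)
Aerospace Blockset Version 4.2 (R2019b)
Statistics and Machine Learning Toolbox Version 11.6 (R2019b)
"""

if __name__ == "__main__":
    for key in product_keys_from_ver(SAMPLE):
        print(f"product.{key}")
```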

M1chaelM commented 3 years ago

Based on the above, here is a new version of the relevant section of the installer_input.txt file for our Matlab install:

#product.5G_Toolbox
#product.AUTOSAR_Blockset
product.Aerospace_Blockset
product.Aerospace_Toolbox
#product.Antenna_Toolbox
#product.Audio_Toolbox
#product.Automated_Driving_Toolbox
#product.Bioinformatics_Toolbox
product.Communications_Toolbox
product.Computer_Vision_Toolbox
product.Control_System_Toolbox
product.Curve_Fitting_Toolbox
#product.DDS_Blockset
#product.DO_Qualification_Kit
product.DSP_System_Toolbox
#product.Data_Acquisition_Toolbox
#product.Database_Toolbox
#product.Datafeed_Toolbox
#product.Deep_Learning_HDL_Toolbox
product.Deep_Learning_Toolbox
#product.Econometrics_Toolbox
product.Embedded_Coder
#product.Filter_Design_HDL_Coder
#product.Financial_Instruments_Toolbox
#product.Financial_Toolbox
#product.Fixed-Point_Designer
#product.Fuzzy_Logic_Toolbox
#product.GPU_Coder
#product.Global_Optimization_Toolbox
#product.HDL_Coder
#product.HDL_Verifier
#product.IEC_Certification_Kit
product.Image_Acquisition_Toolbox
product.Image_Processing_Toolbox
product.Instrument_Control_Toolbox
#product.LTE_Toolbox
#product.Lidar_Toolbox
product.MATLAB
product.MATLAB_Coder
product.MATLAB_Compiler
product.MATLAB_Compiler_SDK
#product.MATLAB_Parallel_Server
#product.MATLAB_Production_Server
product.MATLAB_Report_Generator
#product.MATLAB_Web_App_Server
#product.Mapping_Toolbox
#product.Mixed-Signal_Blockset
product.Model_Predictive_Control_Toolbox
#product.Model-Based_Calibration_Toolbox
#product.Motor_Control_Blockset
product.Navigation_Toolbox
#product.OPC_Toolbox
product.Optimization_Toolbox
#product.Parallel_Computing_Toolbox
#product.Partial_Differential_Equation_Toolbox
#product.Phased_Array_System_Toolbox
#product.Polyspace_Bug_Finder
#product.Polyspace_Bug_Finder_Server
#product.Polyspace_Code_Prover
#product.Polyspace_Code_Prover_Server
#product.Powertrain_Blockset
#product.Predictive_Maintenance_Toolbox
#product.RF_Blockset
#product.RF_Toolbox
product.ROS_Toolbox
#product.Radar_Toolbox
product.Reinforcement_Learning_Toolbox
#product.Risk_Management_Toolbox
product.Robotics_System_Toolbox
#product.Robust_Control_Toolbox
#product.Satellite_Communications_Toolbox
#product.Sensor_Fusion_and_Tracking_Toolbox
#product.SerDes_Toolbox
product.Signal_Processing_Toolbox
#product.SimBiology
#product.SimEvents
#product.Simscape
#product.Simscape_Driveline
#product.Simscape_Electrical
#product.Simscape_Fluids
#product.Simscape_Multibody
product.Simulink
#product.Simulink_3D_Animation
#product.Simulink_Check
product.Simulink_Code_Inspector
product.Simulink_Coder
#product.Simulink_Compiler
#product.Simulink_Control_Design
#product.Simulink_Coverage
#product.Simulink_Design_Optimization
#product.Simulink_Design_Verifier
#product.Simulink_Desktop_Real-Time
#product.Simulink_PLC_Coder
#product.Simulink_Real-Time
#product.Simulink_Report_Generator
#product.Simulink_Requirements
#product.Simulink_Test
#product.SoC_Blockset
#product.Spreadsheet_Link
product.Stateflow
product.Statistics_and_Machine_Learning_Toolbox
product.Symbolic_Math_Toolbox
#product.System_Composer
product.System_Identification_Toolbox
#product.Text_Analytics_Toolbox
#product.UAV_Toolbox
#product.Vehicle_Dynamics_Blockset
#product.Vehicle_Network_Toolbox
#product.Vision_HDL_Toolbox
#product.WLAN_Toolbox
#product.Wavelet_Toolbox
#product.Wireless_HDL_Toolbox
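
Editing this section by hand is error-prone with over a hundred entries, so the selection could be scripted. The sketch below is an assumption-laden illustration, not the way the file above was produced: it treats any line of the form `product.<name>` or `#product.<name>` as a product entry, enables the wanted ones, comments out the rest, and leaves all other lines alone. The function name and the error behavior are made up for this example.

```python
import re

# A product entry is an optional '#', then "product.<name>" alone on the line.
PRODUCT = re.compile(r"^#?\s*product\.(?P<name>[\w-]+)\s*$")


def select_products(installer_text, wanted):
    """Uncomment the wanted product lines and comment out every other one."""
    wanted = set(wanted)
    seen = set()
    result = []
    for line in installer_text.splitlines():
        match = PRODUCT.match(line)
        if match:
            name = match.group("name")
            seen.add(name)
            prefix = "" if name in wanted else "#"
            result.append(f"{prefix}product.{name}")
        else:
            result.append(line)  # leave non-product lines untouched
    missing = wanted - seen
    if missing:
        # Most likely a typo in the wanted list.
        raise ValueError(f"not in installer file: {sorted(missing)}")
    return "\n".join(result) + "\n"


if __name__ == "__main__":
    template = "#product.MATLAB\n#product.Simulink\n#product.WLAN_Toolbox\n"
    print(select_products(template, {"MATLAB", "Simulink"}), end="")
```

Raising on an unknown name is a deliberate choice here: a misspelled product would otherwise be silently left out of the install.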

M1chaelM commented 3 years ago

Installing with just the products above produces a 20.3 GiB image. I don't yet know whether this makes a significant difference in spin-up time.
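
For reference, the relative savings from the three sizes reported in this thread (44 GiB, 29 GiB, 20.3 GiB) work out to roughly 34% from the multistage build, a further 30% from trimming the product list, and about 54% overall. A short sketch of that arithmetic, with an illustrative function name:

```python
def reduction_pct(before_gib, after_gib):
    """Percentage by which the image shrank going from before to after."""
    return 100.0 * (before_gib - after_gib) / before_gib


# Sizes as reported in this thread, in GiB.
ORIGINAL = 44.0         # full install before the multistage build
MULTISTAGE_FULL = 29.0  # full install with the multistage build
TRIMMED = 20.3          # trimmed product list

if __name__ == "__main__":
    print(f"multistage build: {reduction_pct(ORIGINAL, MULTISTAGE_FULL):.1f}%")
    print(f"trimmed products: {reduction_pct(MULTISTAGE_FULL, TRIMMED):.1f}%")
    print(f"overall:          {reduction_pct(ORIGINAL, TRIMMED):.1f}%")
```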

M1chaelM commented 3 years ago

I built both the full and trimmed versions of this image and put them in a repository (npslearninglab/me4823) with tags to indicate which is which: matlab_full and matlab_small.

I then launched each image with cloudsim using the following commands to get a rough estimate of the time required to launch:

curl -X POST -H "Private-Token: ${TOKEN}" https://staging-cloudsim-nps.ignitionrobotics.org/1.0/start -F "image=npslearninglab/me4823:matlab_small" -F "name=4823matlabsmall"

This switched to status "waiting for docker image" after about 5 minutes, then stayed in that status for over 60 minutes. After 60 minutes I sent a stop call, which immediately changed the state to "removing instance."

curl -X POST -H "Private-Token: ${TOKEN}" https://staging-cloudsim-nps.ignitionrobotics.org/1.0/start -F "image=npslearninglab/me4823:matlab_full" -F "name=4823matlabfull"

The command above had approximately the same behavior. This time I let it run for 2 hours before stopping it.

M1chaelM commented 3 years ago

@tfoote @nkoenig, any reason you would expect the above commands to fail? I thought they would take a while due to the size of the image, but I didn't expect them to take more than an hour, so I'm wondering if something went wrong. Are there restrictions on the location of the source repository, for example? The one I used was intended just for expedience while we're waiting to get our organization set up.

tfoote commented 3 years ago

@M1chaelM I would expect it to work.

I pulled it and validated that it works locally with the invocation:

docker run --rm -it -p 8080:8080 --gpus all npslearninglab/me4823:matlab_small

M1chaelM commented 3 years ago

@tfoote Thanks for checking that it runs locally. I sent the command to start matlab_small again today, using @bsb808's script to record timing. It is still stuck at the "Waiting for docker image" step and has been there for about 50 minutes. I will reach out to @nkoenig to see whether he can get any more info on what's happening on the backend.
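
A timing record of this kind can be collected with a small polling loop that notes when the reported status changes. The sketch below is not @bsb808's script, which is not shown in this thread. It takes the status lookup as a caller-supplied function because the cloudsim status endpoint is not given here either, and the "running" status used in the demo is a placeholder assumption.

```python
import time


def record_status_changes(get_status, done, timeout_s, poll_s=10.0,
                          clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until done(status) is true or timeout_s elapses.

    Returns (history, finished): history is a list of (elapsed_seconds, status)
    pairs, one per status change; finished is False if the timeout was hit.
    get_status is hypothetical: supply whatever queries your backend.
    """
    start = clock()
    history = []
    last = None
    while True:
        elapsed = clock() - start
        status = get_status()
        if status != last:
            history.append((elapsed, status))
            last = status
        if done(status):
            return history, True
        if elapsed >= timeout_s:
            return history, False
        sleep(poll_s)


if __name__ == "__main__":
    # Demo with canned statuses instead of a real backend query.
    canned = iter(["launching", "waiting for docker image", "running"])
    history, finished = record_status_changes(
        get_status=lambda: next(canned),
        done=lambda status: status == "running",  # placeholder final status
        timeout_s=60,
        poll_s=0.01,
    )
    for elapsed, status in history:
        print(f"{elapsed:8.2f}s  {status}")
    print("finished" if finished else "timed out")
```

Passing the clock and sleep functions as parameters keeps the loop testable without waiting in real time.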

M1chaelM commented 3 years ago

The resolution of this issue was that there is no way to further reduce the size of the Matlab image, but by extending the timeout on the backend we can still run it. It takes 5-10 minutes longer to start up, but I think faster startup time will be a separate feature request, so I'm closing the issue.