wallscheid opened 4 years ago
* Use a profiling tool (e.g. the one built into PyCharm) to investigate the time consumption of the different toolbox parts for the provided examples or a dummy controller input
* Identify possible potential for improving the code in terms of processing time
* Evaluate alternative implementations to speed up the code (e.g. on a specific test branch)
The speed analysis has been done for various motor environments, and the GEM-related code bottlenecks have been prioritized. The following steps were followed for the analysis:
* A sanity check of code coverage while profiling each environment, to verify that the entire GEM part is covered
* Code profiling using PyCharm's built-in cProfile integration, which provides both a statistical and a graphical representation of the profiling report
It was found that the majority of the time is spent in non-GEM frameworks such as Keras and TensorFlow and in libraries such as SciPy. A closer investigation showed that most of the time within the GEM part is spent in the Python files listed in the attached report (kindly see it for function-level details).
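For reference, a profiling run for one environment could look roughly like the sketch below. The environment ID and the reset/step signature are assumptions for illustration (they differ between gym-electric-motor versions), not the exact setup used for the attached reports.

```python
# Sketch of how a per-environment profiling report could be generated.
import cProfile
import pstats

import gym_electric_motor as gem

env = gem.make('DcSeriesCont-v1')  # placeholder environment ID

def run_episode(steps=10_000):
    env.reset()
    for _ in range(steps):
        action = env.action_space.sample()  # dummy controller input
        observation, reward, done, info = env.step(action)
        if done:
            env.reset()

profiler = cProfile.Profile()
profiler.enable()
run_episode()
profiler.disable()

# .pstat file that PyCharm can open via Tools -> Open CProfile Snapshot
profiler.dump_stats('DcSeriesCont-v1.pstat')
pstats.Stats(profiler).sort_stats('cumulative').print_stats(20)
```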
Improvement suggestions: At first sight, Python code optimization using Cython, Numba or Pythran seems to fit the cause, although the majority of modules already use libraries that have such optimizations built in. It would be interesting to see where this leads and to what extent we can optimize.
Additionally, I have noticed the earlier optimization efforts using the Jacobian, but I need to dig deeper into that.
I had a call with Pramod in which he showed me the steps he did for the code investigations. I have no further remarks on this issue.
@pramodcm95 : During the last meeting you also showed the structure overview of the built-in PyCharm profiler regarding the computational demand per code part (flowchart-style diagram). Could you please add these diagrams as pictures to this report issue? I believe they provide a valuable overview.
In the issue branch on GitHub, I have checked in these files as "___.pstat" files (one for each environment), which need to be imported into PyCharm (Tools -> Open CProfile Snapshot). This should be the best way to get that graph, since we can then navigate directly to the respective source file.
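For completeness, the same .pstat files can also be inspected without PyCharm via the standard library's pstats module (the file name below is a placeholder for one of the per-environment reports):

```python
# Read one of the checked-in .pstat files with the standard library.
import pstats

stats = pstats.Stats('DcSeriesCont-v1.pstat')
stats.strip_dirs().sort_stats('cumulative').print_stats(20)  # top 20 by cumulative time
```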
Here in this issue, I can post the snapshot if you want. But are you expecting the flowchart snapshot of each environment and controller combination? I am afraid that may get chaotic here. Please let me know if the snapshot is required for all environments and I will take care of it.
Since we would like to maintain a slim branch structure in the long run, the issue branch will be deleted at some point. Hence, I believe it would be simplest if you could put the relevant files into an archive folder and upload it here to the issue thread. I believe these reports are static files which do not require the specific code from which they have been generated, right?
Yes, they have been generated for different environment IDs, so changing an environment would give me a new report.
I understand the concept now. I am uploading a folder with the relevant files for a quick update. The snapshots will also be posted soon.
Code profiling delivered some interesting insights into the computational demand of the different parts of the toolbox.
Next steps will be to evaluate possible speed improvements, e.g. whether parts of the toolbox can be transferred to Cython or Numba to obtain a speed-up.
@pramodcm95 I believe with the above zip archive the results of your analysis are saved within this issue thread also for later inspection & discussion. For repo clearness I would like to delete the branch attached to this issue - do you object?
No @wallscheid, it is fine to remove it from the repo.
@pramodcm95 : could you please provide a short update on your working status? The supervisors are missing a sign of life.
Yes. Initially I had two options to explore: 1) Cython - which requires all files to be converted to .pyx format and compiled with an external C compiler. This resulted in loads of errors, and upon further deep investigation I realized that the Cython GitHub already has lots of such issues posted and is working on them.
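For reference, the general Cython workflow referred to above looks roughly like this; the module and package names here are placeholders, not actual GEM files:

```python
# setup.py - illustrative sketch of the Cython build step
from setuptools import setup
from Cython.Build import cythonize

setup(
    name="gem_cython_experiment",          # placeholder package name
    ext_modules=cythonize(
        ["mechanical_load.pyx"],            # hypothetical module converted from .py to .pyx
        language_level=3,
    ),
)
```

The extension is then built with an external C compiler via `python setup.py build_ext --inplace`.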
So the simpler option would be to switch to Numba, which takes more time for the first compilation and from then on, depending on the type of code, speeds up the process.
2) Numba: This basically uses a JIT (just-in-time) compiler. The following Numba decorators have been explored and tried in the project (a minimal sketch follows the list):
* @jitclass - compiles a whole class using the JIT compiler
* @jit - simplest version, which gives us warnings if the prerequisites of Numba are not met
* @njit - so-called "no-Python mode", runs completely in Numba-compiled code
* @njit(parallel=True) - parallelizes the code if the above Numba conditions are satisfied
* @vectorize - vectorizes the code
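The sketch below shows two of these decorators on a small placeholder function (not GEM code); the function names and the Euler-step example are illustrative only:

```python
# Minimal sketch comparing a plain NumPy function with Numba-compiled variants.
import numpy as np
from numba import njit, vectorize

def euler_step_py(state, deriv, tau):
    # plain Python/NumPy reference implementation
    return state + tau * deriv

@njit(cache=True)
def euler_step_njit(state, deriv, tau):
    # "no-Python mode": compiled on first call, fast afterwards
    return state + tau * deriv

@vectorize(['float64(float64, float64, float64)'])
def euler_step_vec(state, deriv, tau):
    # compiled element-wise ufunc
    return state + tau * deriv

state = np.zeros(4)
deriv = np.ones(4)
print(euler_step_njit(state, deriv, 1e-4))  # first call includes compilation
print(euler_step_vec(state, deriv, 1e-4))
```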
As all of these have lots of prerequisites, the idea of attacking the function that takes most of the time is an ambitious task, since it has lots of interlinked functions which cannot be converted to Numba straight away. I am stuck at a certain point, where adding a Numba decorator demands an extra argument for certain functions and does not behave the same for other functions. I am reading through their repository to find whether any such issues have been reported before.
I could execute mechanical_ode (in the mechanical loads) in Numba's object mode. Unfortunately there isn't any great improvement; instead, the own time of the function is increasing.
(left: without Numba, right: with Numba)
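For context, object mode just means forcing the decorator to fall back to Python objects instead of nopython compilation. The snippet below is a simplified stand-in, not the actual GEM mechanical_ode:

```python
# Simplified stand-in illustrating Numba's object mode via forceobj=True.
import numpy as np
from numba import jit

@jit(forceobj=True)
def mechanical_ode(t, state, torque, load_torque=0.0, inertia=1e-3):
    omega = state[0]
    d_omega = (torque - load_torque) / inertia  # simple rigid-body load model
    return np.array([d_omega])

print(mechanical_ode(0.0, np.array([10.0]), 1.5))
```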
@pramodcm95 : could you please upload your altered code to a dedicated feature branch so that others can have a look at it as well.
Regarding the result: are there any statements from the Numba project or from Numba users that initial calls of Numba code lead to quite a significant overhead? Maybe the computational load of the code you have exchanged is very small while this presumed overhead is large?
Sure. So I will create a new branch for this issue now? (Since we deleted the issue 35 branch before.)
Yes, Numba works that way. It has a very high initial overhead; once the compiled functions are cached, the execution time starts to go down.
A simple example (a minimal placeholder illustrating the first-call compilation overhead) looks like this:
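```python
# Minimal illustration of Numba's first-call compilation overhead.
# The function below is a placeholder, not GEM code.
import time
import numpy as np
from numba import njit

@njit(cache=True)
def weighted_sum(x, w):
    total = 0.0
    for i in range(x.shape[0]):
        total += w[i] * x[i]
    return total

x = np.random.rand(1_000_000)
w = np.random.rand(1_000_000)

t0 = time.perf_counter()
weighted_sum(x, w)          # first call: compilation dominates
t1 = time.perf_counter()
weighted_sum(x, w)          # second call: compiled code only
t2 = time.perf_counter()

print(f"first call:  {t1 - t0:.4f} s (includes compilation)")
print(f"second call: {t2 - t1:.4f} s")
```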
Maybe this package could help to outsource performance-critical code into C code: