upb-lea / gym-electric-motor

Gym Electric Motor (GEM): An OpenAI Gym Environment for Electric Motors
MIT License

Code Speed Investigation and Improvement #35

Open wallscheid opened 4 years ago

wallscheid commented 4 years ago
* Use a profiling tool (e.g. the one built into PyCharm) to investigate the time consumption of the different toolbox parts, using the provided examples or a dummy controller input

* Identify potential improvements to the code in terms of processing time

* Evaluate alternative implementations to speed up the code (e.g. on a specific test branch)

pramodcm95 commented 4 years ago

The speed analysis has been done for various motor environments. The GEM-related code bottlenecks are prioritized. The following steps were followed for the analysis:

Additionally, I have noticed the earlier optimization efforts using the Jacobian, but I need to dig deeper into that.


XyDrKRulof commented 4 years ago

I had a call with Pramod in which he showed me the steps he took for the code investigation. I have no further remarks on this issue.

wallscheid commented 4 years ago

@pramodcm95: During the last meeting you also showed the structure overview of the built-in PyCharm profiler regarding the computational demand per code part (flowchart-style diagram). Could you please add these diagrams as pictures to this issue? I believe they provide a valuable overview.

pramodcm95 commented 4 years ago

In the issue branch on GitHub, I have checked in these files as "___.pstat" files (one for each environment), which need to be imported into PyCharm (via Tools -> Open CProfile snapshot). This should be the best way to get that graph, since we can navigate to the respective source file directly.
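For readers without the original files, the following is a minimal sketch of how such a `.pstat` file can be produced and inspected with Python's standard `cProfile` and `pstats` modules. The function `dummy_simulation` and the file name `dummy_env.pstat` are hypothetical stand-ins and are not taken from the GEM code base.

```python
import cProfile
import pstats
import tempfile
from pathlib import Path


def dummy_simulation(steps):
    """Hypothetical stand-in for an environment loop (not GEM code)."""
    state = 0.0
    for k in range(steps):
        state += 0.5 * (k % 7) - 0.1 * state
    return state


# Hypothetical output file; one such file could be written per environment.
stats_file = Path(tempfile.gettempdir()) / "dummy_env.pstat"

# Profile only the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
dummy_simulation(50_000)
profiler.disable()

# Write the binary statistics file that profiler front ends can open.
profiler.dump_stats(str(stats_file))

# The same file can also be inspected without an IDE.
stats = pstats.Stats(str(stats_file))
stats.sort_stats("cumulative").print_stats(5)
```

Sorting by cumulative time lists the call paths that dominate the run first, which is roughly the information the call graph view shows.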

Here, in this issue, I can post the snapshots if you want. But are you expecting a flowchart snapshot for each environment and controller combination? I am afraid that may get chaotic here. Please let me know if snapshots are required for all environments and I will provide them.

wallscheid commented 4 years ago

Since we would like to maintain a slim branch structure in the long run, the issue branch will be deleted at some point. Hence, I believe it would be simplest if you could throw the relevant files into an archive folder and upload it here to the issue thread. I believe these reports are static files which do not require the specific code from which they have been generated, right?

pramodcm95 commented 4 years ago

Yes, they have been generated for different environment IDs. So changing the environment would give me a new report.

And I understand the concept now. I am uploading a folder with the relevant files as a quick update now. The snapshots will also be posted soon.


wallscheid commented 4 years ago

Code profiling delivered some interesting insights into the computational demand of the different parts of the toolbox.

Next steps will be to evaluate possible speed improvements, e.g. whether parts of the toolbox can be transferred to Cython or Numba to obtain a speed-up.

wallscheid commented 4 years ago

@pramodcm95 I believe that with the above zip archive the results of your analysis are saved within this issue thread for later inspection and discussion. To keep the repo clean, I would like to delete the branch attached to this issue. Do you object?

pramodcm95 commented 4 years ago

No, @wallscheid. It is fine to remove it from the repo.

wallscheid commented 4 years ago

@pramodcm95: could you please provide a short update on your working status? The supervisors are missing a sign of life.

pramodcm95 commented 4 years ago

Yes. Initially I had two options to explore: 1) Cython: this requires all files to be converted to the .pyx format and compiled with an external C compiler. This resulted in loads of errors, and upon further in-depth investigation I realized that the Cython GitHub repository already has lots of related issues posted that are being worked on.

So the simpler option would be to switch to Numba, which takes more time for the first compilation and from then on, depending on the type of code, speeds up execution.

2) Numba: this basically uses a JIT (just-in-time) compiler. The following decorators from Numba have been explored and tentatively integrated into the project:

* @jitclass: compiles a whole class using the JIT compiler
* @jit: the simplest version, which gives us warnings if the prerequisites of Numba are not met
* @njit: "nopython mode", runs completely in Numba style
* @njit(parallel=True): parallelizes the code, if the above Numba conditions are satisfied
* @vectorize: vectorizes the code

As all of these have lots of prerequisites, attacking the function that takes a lot of time turned out to be an ambitious task, since it has lots of interlinked functions which cannot be converted to Numba straightaway. I am stuck at a certain point, where adding a Numba decorator demands an extra argument in certain functions and does not behave the same way for other functions. I am reading through their repository to find out whether any such issues have been reported before.
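One common pattern, which may or may not apply to the problem described above, is to move the numerical core out of a method and into a free function that only takes scalars or arrays, because nopython mode cannot type an arbitrary Python `self` object. The sketch below illustrates this under stated assumptions: the function names, the polynomial load model and its parameters `a`, `b`, `c` and `j_total` are hypothetical and not copied from the GEM code. The `ImportError` fallback is only there so the sketch also runs where Numba is not installed.

```python
try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (no speed-up then).
    def njit(func):
        return func


@njit
def load_torque(omega, a, b, c):
    """Hypothetical polynomial load torque with constant, linear and
    quadratic parts, acting against the direction of rotation."""
    sign = 1.0 if omega >= 0.0 else -1.0
    return sign * (a + b * abs(omega) + c * omega * omega)


@njit
def mechanical_ode_sketch(omega, torque, a, b, c, j_total):
    """Hypothetical free-function version of a mechanical ODE:
    d(omega)/dt = (motor torque - load torque) / total inertia."""
    return (torque - load_torque(omega, a, b, c)) / j_total


# All arguments are plain floats, so nopython mode can infer the types.
domega_dt = mechanical_ode_sketch(10.0, 5.0, 0.1, 0.01, 0.001, 0.5)
print(domega_dt)
```

A class method could then call such a function and pass its attributes as explicit arguments, which keeps the object-oriented interface unchanged while only the arithmetic is compiled.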

pramodcm95 commented 4 years ago

I could execute mechanical_ode (in the mechanical loads) in Numba's object mode. Unfortunately, there isn't any great improvement. Instead, the own time of the function is increasing.


(Left: without Numba; right: with Numba.)

wallscheid commented 4 years ago

@pramodcm95: could you please upload your altered code to a dedicated feature branch so that others can also have a look at it?

Regarding the result: are there any statements from the Numba project or Numba users that initial calls of Numba code lead to quite a significant overhead? Maybe the computational load of the code you have exchanged is very small while this presumed overhead is large?

pramodcm95 commented 4 years ago

Sure. So I will create a new branch for this issue now? (We had deleted the issue 35 branch before.)

Yes, Numba works that way. It has a very high initial overhead; once all the values are cached, it starts to reduce the execution time.

A simple example looks like this: [image]
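The original picture is not available in this thread. As a rough substitute, the sketch below times the first and the second call of a small jitted function; it is not a reconstruction of the posted image. Numba compiles a function the first time it is called with a given argument type signature, so the first call includes the compilation time. The function `sum_of_squares` and the helper `timed` are hypothetical, and the `ImportError` fallback only keeps the sketch runnable without Numba (in which case both calls take similar time).

```python
import time

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (no compilation then).
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    """Simple loop that benefits from compilation."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total


def timed(func, *args):
    """Return the result of func(*args) and the elapsed wall-clock time."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# First call: includes JIT compilation for the int argument signature.
first_result, first_time = timed(sum_of_squares, 100_000)
# Second call: reuses the already compiled machine code.
second_result, second_time = timed(sum_of_squares, 100_000)

print(f"first call:  {first_time:.6f} s")
print(f"second call: {second_time:.6f} s")
```

If the replaced function is cheap and called only a few times, the one-off compilation cost can outweigh the gain, which matches the concern raised above about small computational load versus large overhead.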

wkirgsn commented 3 years ago

Maybe this package could help to outsource performance-critical code into C code:
