TunnRL / TunnRL_TBM_maintenance

Working repository for the code of the TunnRL TBM project
MIT License

Increase training speed by testing different techniques #17

Open tfha opened 2 years ago

tfha commented 2 years ago
  1. Profile the code with standard profiling techniques (cProfile etc.) to find the parts worth optimizing (see the sketch after this list): https://machinelearningmastery.com/profiling-python-code/
  2. MKL
  3. Running on the NGI Odin machine
  4. Running on Azure (if it is available for use in our NGI Azure cloud) or AWS/Google Cloud etc.
  5. Sharing a database and running Optuna optimization from many machines against the same database files
  6. Making it possible to automatically kick off as many processes as there are CPUs on a machine, i.e. parallelization
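
A minimal profiling sketch for point 1, assuming a callable `train_agent()` exists (its real signature in this repo may differ): run one training pass under cProfile and print the functions with the highest cumulative time.

```python
import cProfile
import pstats


def train_agent() -> None:
    """Placeholder standing in for the project's real training routine."""
    ...


if __name__ == "__main__":
    # Collect timing statistics for one training run.
    with cProfile.Profile() as profiler:
        train_agent()

    # Show the 20 functions with the highest cumulative time; these are
    # usually the parts worth optimizing.
    stats = pstats.Stats(profiler)
    stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(20)
```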
tfha commented 2 years ago

Numbers 3, 5 and 6 are now tested and OK.
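
A hedged sketch of how points 5 and 6 can work together with Optuna: every machine opens the same study in a shared storage, and each machine runs one worker per CPU. The study name, storage URL and objective below are illustrative assumptions, not the project's actual values.

```python
import os

import optuna


def objective(trial: optuna.Trial) -> float:
    # Placeholder objective; the real one would train and evaluate an agent.
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    return lr  # dummy value


if __name__ == "__main__":
    study = optuna.create_study(
        study_name="tbm_maintenance",          # hypothetical study name
        storage="sqlite:///optuna_study.db",   # same file/DB for every machine
        load_if_exists=True,                   # join an existing study (point 5)
        direction="maximize",
    )
    # n_jobs=os.cpu_count() kicks off one worker per CPU on this machine (point 6).
    study.optimize(objective, n_trials=100, n_jobs=os.cpu_count() or 1)
```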

tfha commented 2 years ago

Point 2 is already implemented in the existing libraries.
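
Since MKL typically enters through the numerical libraries themselves (e.g. a NumPy build linked against MKL), a quick check of the build configuration is usually all that is needed; a small sketch:

```python
import numpy as np

# Prints the BLAS/LAPACK backends this NumPy build links against;
# an MKL-linked build lists MKL here.
np.show_config()
```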

tfha commented 2 years ago

A report from profiling the train_agent method. This highlights a number of processes to handle.

Some thoughts on significant processes that take time:

tfha commented 2 years ago

I have tested running on Azure, but it did not work properly. I will try once more with some new advice.