agencyenterprise / neurotechdevkit

Neurotech Development Kit (NDK)
https://agencyenterprise.github.io/neurotechdevkit/
Apache License 2.0

Improve memory-usage estimation #140

Open charlesbmi opened 1 year ago

charlesbmi commented 1 year ago

Describe the new feature or enhancement

Please provide a clear and concise description of what you want to add or change.

NDK provides a memory-usage estimate that helps the user decide early on whether a simulation can complete successfully. The current estimates are based on an initial internal exploration and could be improved for larger simulations.

One simulation that NDK estimated at 132 GB ran out of memory on a c5.24xlarge (192 GB memory), which suggests the estimates run low for large simulations.

Accurate memory estimation improves users' ability to plan and execute large simulations effectively. By enhancing this feature, we can provide users with more confidence in their AWS instance selection and avoid potential simulation failures due to memory constraints.

Please describe how you would use this new feature.

I would use it to determine the appropriate AWS instance size for running a large NDK simulation.
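As a rough illustration of that use case, the sketch below picks the smallest instance whose memory covers the NDK estimate multiplied by a safety factor. The function name `pick_instance`, the default `safety_factor` of 1.5, and the small catalog are assumptions made for this example and are not part of NDK. The memory figures should be checked against current AWS documentation, and the sketch ignores the GB/GiB distinction.

```python
# Hypothetical helper, not part of NDK: choose an instance from a
# user-supplied catalog given a memory estimate and a safety margin.

# Illustrative catalog (instance name -> memory in GiB); verify before use.
INSTANCE_MEMORY_GIB = {
    "c5.18xlarge": 144,
    "c5.24xlarge": 192,
    "r5.12xlarge": 384,
    "r5.24xlarge": 768,
}


def pick_instance(estimated_gb, safety_factor=1.5, catalog=INSTANCE_MEMORY_GIB):
    """Return the smallest instance with memory >= estimate * safety_factor.

    Returns None when no instance in the catalog is large enough.
    """
    required = estimated_gb * safety_factor
    # Sort candidates by memory so min() yields the smallest sufficient one.
    candidates = [(mem, name) for name, mem in catalog.items() if mem >= required]
    if not candidates:
        return None
    return min(candidates)[1]


# The 132 GB estimate from this issue, with and without a margin.
with_margin = pick_instance(132)
without_margin = pick_instance(132, safety_factor=1.0)
```

With the assumed 1.5x margin, the 132 GB case skips the c5.24xlarge that ran out of memory. The right margin depends on how far the estimate can be off, which is what this issue asks to pin down.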

Describe your proposed implementation

Options:

  1. Work with the Stride developers to derive a first-principles calculation of memory consumption.
  2. Run simulations with ram_monitor.py across a wider range of values, and use the measurements to improve the linear-regression (or other) model behind the RAM estimate.

Describe possible alternatives

An alternative approach would be to use existing AWS services, such as CloudWatch, to monitor the memory usage of running simulations in real time. However, this might introduce additional complexity and overhead, and it might not be as accurate as an improved internal estimation algorithm tailored specifically to NDK simulations.