Meaning of task_batch_size_energy and task_batch_size_forces

xyttyxy commented 3 years ago

The question: What is the meaning of these parameters in the input.nn file?

What did you already try? We observe that the quality of the energies and forces is significantly affected by these parameters. Reading JCTC 2019, 15, 3075-3092 seems to suggest that they are similar to the mini-batches used for benchmarking the SGD and Adam methods. In the context of MS-EKF, the batch size is determined by the number of streams, so I do not understand the use of these parameters.

singraber commented 3 years ago

Hi,

Sorry for the late reply! Yes, indeed, the parameters task_batch_size_energy and task_batch_size_forces are used to create mini-batches for SGD and Adam. If you use MS-EKF, they are usually set to 1 because, as you already mentioned, the parallel Kalman filter training comes with its own way of mini-batching training "patterns" (energies and forces) via MPI tasks. In contrast to SGD or Adam, the MS-EKF does not just sum or average over mini-batches but rather extends the linear algebra expressions of the serial KF algorithm (as described in the paper). If you use numbers > 1 for MS-EKF, it will perform additional "SGD-style" mini-batching on each MPI task, which is probably not a desired mode of operation.
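To make the distinction concrete, here is a minimal toy sketch in Python. It is not n2p2 code: a linear model with three weights stands in for the network, and all function names are hypothetical. It only contrasts the two ways of combining several patterns: "SGD-style" averaging collapses a batch into one error and one gradient vector, while a multi-stream scheme keeps one error and one Jacobian row per pattern.

```python
# Toy illustration only; names are hypothetical and not part of n2p2.
# A linear model y = w . x stands in for the neural network.

def error_and_gradient(w, x, target):
    """Return (target - prediction) and d(prediction)/dw for one pattern."""
    prediction = sum(wi * xi for wi, xi in zip(w, x))
    return target - prediction, list(x)

def sgd_style_batch(w, batch):
    """Average over the batch: one scalar error and one gradient vector."""
    n = len(batch)
    errors, grads = zip(*(error_and_gradient(w, x, t) for x, t in batch))
    mean_error = sum(errors) / n
    mean_grad = [sum(g[i] for g in grads) / n for i in range(len(w))]
    return mean_error, mean_grad

def multi_stream_stack(w, batch):
    """Keep patterns separate: an error vector and a Jacobian with one row per pattern."""
    errors, grads = zip(*(error_and_gradient(w, x, t) for x, t in batch))
    return list(errors), [list(g) for g in grads]

w = [0.5, -1.0, 2.0]
batch = [
    ([1.0, 0.0, 0.0], 1.0),
    ([0.0, 1.0, 0.0], 0.0),
    ([0.0, 0.0, 1.0], 1.0),
    ([1.0, 1.0, 1.0], 2.0),
]

mean_error, mean_grad = sgd_style_batch(w, batch)   # 1 error, 1 gradient
errors, jacobian = multi_stream_stack(w, batch)     # 4 errors, 4 x 3 Jacobian
```

In the averaged form, the individual errors partly cancel before the update sees them. In the stacked form, the update receives every pattern's error and derivative separately, which corresponds to the extended linear algebra expressions mentioned above.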

What exactly did you try, and what were the results?

Best, Andreas

vsumaria commented 3 years ago

Hello @singraber (I am from the same research group as @xyttyxy). I tried changing task_batch_size_forces from 1 to 10, and it significantly reduced the per-epoch time during training. I would like to understand how to choose this value to optimize the training time.
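One plausible reading of this speed-up, not confirmed in the thread, is simple arithmetic: if each MPI task collects task_batch_size patterns before a weight update, fewer updates are needed to pass through the same number of patterns. The sketch below is a hypothetical helper, not an n2p2 function, and it assumes every pattern is used exactly once per epoch; the pattern and task counts are made-up numbers.

```python
import math

def updates_per_epoch(n_patterns, n_mpi_tasks, task_batch_size):
    """Hypothetical estimate: weight updates needed to process all patterns once."""
    patterns_per_update = n_mpi_tasks * task_batch_size
    return math.ceil(n_patterns / patterns_per_update)

# Made-up example: 10000 force patterns distributed over 4 MPI tasks.
updates_batch_1 = updates_per_epoch(10000, 4, 1)
updates_batch_10 = updates_per_epoch(10000, 4, 10)
```

Under these assumptions, a tenfold larger batch size means ten times fewer updates per epoch. Fewer updates would lower the per-epoch time, but each update then acts on averaged information, so a shorter epoch does not by itself imply a better or faster-converging fit.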

CompPhysVienna / n2p2

Meaning of task_batch_size_energy and task_batch_size_forces #111