WIP - Refactor reward terms params

WIP description - this refactors the params for reward terms. We now always use reward terms; the only remaining choice is whether to apply auto-norm.

The big change is here, and the rest follows from removing some params: https://github.com/SkymindIO/nativerl/commit/5af9d522554fd16dc0c6a7acc3af24d3931d1955#diff-6a83e83402a63a80857e3c3a49bb5120f23e4eac02fc6523dc80fe612dae8754R127-R133

alphas is optional here. The number of reward terms is determined by the env itself, by querying reward_terms(). use_auto_norm is kept separate and can only be true if the number of terms is > 1.
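A minimal sketch of the rules above, to make the intended behaviour concrete. It is not the code from the linked commit: DummyEnv and resolve_reward_params are hypothetical names, and both the default of 1.0 per alpha and the choice to raise an error (rather than silently disable auto-norm) are assumptions. Only reward_terms(), alphas and use_auto_norm come from this description.

```python
from typing import List, Optional, Sequence, Tuple


class DummyEnv:
    """Hypothetical stand-in env; only reward_terms() is named in this PR."""

    def __init__(self, terms: Sequence[float]):
        self._terms = list(terms)

    def reward_terms(self) -> List[float]:
        return list(self._terms)


def resolve_reward_params(
    env,
    alphas: Optional[Sequence[float]] = None,
    use_auto_norm: bool = False,
) -> Tuple[List[float], bool]:
    """Validate alphas and use_auto_norm against the env's reward terms."""
    # The env is the source of truth for the number of reward terms.
    num_terms = len(env.reward_terms())

    if alphas is None:
        # Assumption: a missing alphas list means equal weighting of 1.0.
        alphas = [1.0] * num_terms
    elif len(alphas) != num_terms:
        raise ValueError(
            f"expected {num_terms} alphas, got {len(alphas)}"
        )

    # use_auto_norm is independent of alphas, but needs more than one term.
    if use_auto_norm and num_terms <= 1:
        raise ValueError("use_auto_norm requires more than one reward term")

    return list(alphas), use_auto_norm
```

With a two-term env, calling resolve_reward_params(env) falls back to the assumed default weights, while passing use_auto_norm=True on a single-term env is rejected.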

This needs a corresponding change on the webapp side: https://github.com/SkymindIO/pathmind-webapp/pull/3679

Todo:

[ ] code changes
[ ] manually test

PathmindAI / nativerl

WIP - Refactor reward terms params #485