This PR adds support for assigning weights to the reward models, according to the values defined in `openvalidators/reward/config.py` or to flags like `--reward.diversity_weights`.
Also includes:
- Adds the reward weights tensor to `self` so it can be used in the forward-pass reward calculation
- Replaces boolean flags such as `--neuron.reciprocate_off` with weight flags like `--reward.reciprocate_weights`
- Adds a `RewardModelType` enum for key normalization and organization
- Adds a `DefaultRewardFrameworkConfig` class defining the default reward weights
- Sets mock reward models to return zeros instead of ones
- Fixes a `MockDataset` naming mismatch bug introduced in the latest dataset change
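
A minimal sketch of how the pieces above could fit together — the `RewardModelType` enum, the `DefaultRewardFrameworkConfig` defaults, and a weighted combination of per-model reward tensors. The class and enum names come from this PR, but their members, the default weight values, and the `combine_rewards` helper are illustrative assumptions, not the actual implementation:

```python
from dataclasses import dataclass
from enum import Enum

import torch


class RewardModelType(Enum):
    """Normalized keys for each reward model (members are assumed)."""
    diversity = "diversity_reward_model"
    reciprocate = "reciprocate_reward_model"


@dataclass
class DefaultRewardFrameworkConfig:
    """Default reward weights (values here are illustrative only)."""
    diversity_weight: float = 0.5
    reciprocate_weight: float = 0.5


def combine_rewards(weights: torch.Tensor, rewards: list) -> torch.Tensor:
    """Weighted sum of per-model reward tensors, one weight per model."""
    stacked = torch.stack(rewards)  # shape: (num_models, num_completions)
    return (weights.unsqueeze(1) * stacked).sum(dim=0)


config = DefaultRewardFrameworkConfig()
weights = torch.tensor([config.diversity_weight, config.reciprocate_weight])
rewards = [torch.tensor([0.2, 0.8]), torch.tensor([0.4, 0.6])]
combined = combine_rewards(weights, rewards)  # tensor([0.3000, 0.7000])
```

Keeping the weights in a tensor on `self` lets the forward pass reuse them without re-reading the config on every call.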