Striveworks / valor

Valor is a centralized evaluation store which makes it easy to measure, explore, and rank model performance.
https://striveworks.github.io/valor/

Allow users to pre-populate joint_df for the Valor text generation streaming manager #762

Closed bnativi closed 2 months ago

bnativi commented 2 months ago

Previously, when the streaming manager was used, any joint_df the user passed in was overwritten with a new empty dataframe that had the correct columns. Now, a new empty dataframe is initialized only if no joint_df, or an empty joint_df, is passed in. A joint_df that is passed in is also validated: the datum_uids must be non-null and unique, at least one of prediction_text or prediction_context_list must be non-null, and no metric value may be null.
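The initialization and validation rules described here can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name init_joint_df, the BASE_COLUMNS list, the default metric columns, and the error messages are all hypothetical, and the sketch assumes the prediction check applies per row.

```python
import pandas as pd

# Hypothetical column layout; the real manager derives its columns from its own config.
BASE_COLUMNS = ["datum_uid", "prediction_text", "prediction_context_list"]


def init_joint_df(joint_df=None, metric_columns=("Bias", "Toxicity")):
    """Return a validated joint_df, or a new empty one if none was supplied."""
    columns = BASE_COLUMNS + list(metric_columns)

    # Only create a fresh dataframe when nothing (or an empty frame) is passed in.
    if joint_df is None or joint_df.empty:
        return pd.DataFrame(columns=columns)

    # Extra guard for this sketch (not stated in the PR): required columns exist.
    missing = [c for c in columns if c not in joint_df.columns]
    if missing:
        raise ValueError(f"joint_df is missing columns: {missing}")

    uids = joint_df["datum_uid"]
    if uids.isnull().any():
        raise ValueError("datum_uid must not be null")
    if not uids.is_unique:
        raise ValueError("datum_uid must be unique")

    # Assumed to be a per-row check: each row needs at least one prediction field.
    no_prediction = (
        joint_df["prediction_text"].isnull()
        & joint_df["prediction_context_list"].isnull()
    )
    if no_prediction.any():
        raise ValueError("each row needs prediction_text or prediction_context_list")

    if joint_df[list(metric_columns)].isnull().any().any():
        raise ValueError("metric values must not be null")

    return joint_df
```

Returning the caller's dataframe unchanged when it passes validation is what lets previously computed results be pre-populated instead of discarded.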

Tests for correct and incorrect joint_df initialization were added to the streaming manager functional test.

Also, the edge-case return values for Bias and Toxicity were changed from 0 to 0.0 to avoid a type error that can otherwise occur. These metric values are expected to be floats.
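To illustrate why the literal matters, here is a small sketch. The PR does not say which edge case is involved, so the helper fraction_or_zero and the empty-denominator scenario are hypothetical. It shows that returning 0.0 keeps the result type a float on every path.

```python
def fraction_or_zero(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    if denominator == 0:
        # Return the float 0.0, not the int 0, so callers always receive a float.
        return 0.0
    return numerator / denominator


# A strict type check passes on both the normal path and the edge case.
assert isinstance(fraction_or_zero(1, 4), float)
assert isinstance(fraction_or_zero(0, 0), float)
# The int literal 0 would not satisfy the same check.
assert not isinstance(0, float)
```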