Measuring and benchmarking the safety of the fine-tuned models

redhat-et / datascience-wg

This repository will be used to track all the work that comes out of the Data Science Working Group

Measuring and benchmarking the safety of the fine-tuned models #5

Open fcanogab opened 1 month ago

fcanogab commented 1 month ago

Several frameworks exist for measuring the safety/harmfulness of a fine-tuned model and benchmarking it against other models. For example, MLCommons defines a framework that can be used for this.

hemajv commented 2 weeks ago

Thanks for bringing this up! I think trying out and evaluating this benchmark is a worthwhile exercise for us. It looks like the benchmark is still at the POC stage, but they have a repo with steps outlined on how to test it out: https://github.com/mlcommons/modelbench

hemajv commented 2 weeks ago

Is this something you might have the bandwidth to try or look into, @fcanogab?

erikerlandson commented 1 week ago

We might also look at unitxt (an IBM open source project).

Jonathan Bnayahu has added some safety-related benchmarks, among others. See this search for the list:

https://github.com/IBM/unitxt/issues?q=author%3Abnayahu+

fcanogab commented 2 days ago

@hemajv, yes, I would like to try to work on this myself.

Thanks for the hint @erikerlandson. I'll take a look at it.