Add Attack / High-risk Datasets [Manuj/Dom] #1

domenicrosati / training-time-domain-authorization

Open domenicrosati opened 3 months ago

domenicrosati commented 3 months ago

Issue

Implement benchmark datasets from high-risk domains.

ToDo:

Each item below is done once its code has been added as a BenchmarkDataset.

[ ] Implement BeaverTails (Dom)
[ ] Implement MedQA (Dom)
[ ] Implement LegalStackExchange (Manuj)
[ ] Implement FinQA (Manuj)
[ ] Implement WMDP (Manuj)
[ ] Implement HumanEval (Manuj)
[ ] Implement ToolBench (Manuj)
[ ] Run baselines on these datasets (Dom)

Notes

From https://docs.google.com/spreadsheets/d/1kpzYLORsLMAMTLYZUZ24EgSxapHGCIt2vD0VtXI8nas/edit?gid=0#gid=0

The task is to implement the BenchmarkDataset class defined at https://github.com/domenicrosati/training-time-domain-authorization/blob/9d42629d45ce13b8ac81ec79fe017b6f05d3736e/training_time_domain_authorization/datasets/datasets.py#L5 for each dataset in that spreadsheet that seems relevant.

The GEM benchmark implementation is an existing example: https://github.com/domenicrosati/training-time-domain-authorization/blob/9d42629d45ce13b8ac81ec79fe017b6f05d3736e/training_time_domain_authorization/datasets/gem.py
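For orientation, here is a rough sketch of what one of these dataset wrappers might look like. The real BenchmarkDataset interface lives in datasets.py and is not reproduced here, so everything below is an assumption: the stand-in base class `BenchmarkDatasetSketch`, the method names `load_split` and `format_example`, the `Example` container, and the record fields (`question`, `options`, `answer`). Records are passed in directly so the sketch runs without downloading anything. The GEM implementation linked above is the thing to actually copy.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Example:
    """Hypothetical container for one prompt/target pair."""
    prompt: str
    target: str


class BenchmarkDatasetSketch(ABC):
    """Stand-in for the repo's BenchmarkDataset. NOT the real interface."""

    name: str = "unnamed"

    @abstractmethod
    def load_split(self, split: str) -> List[dict]:
        """Return the raw records for a split such as 'train' or 'test'."""

    @abstractmethod
    def format_example(self, record: dict) -> Example:
        """Turn one raw record into a prompt/target pair."""

    def examples(self, split: str) -> List[Example]:
        return [self.format_example(r) for r in self.load_split(split)]


class MedQASketch(BenchmarkDatasetSketch):
    """Multiple-choice QA wrapper; the field names are assumed, not MedQA's schema."""

    name = "medqa"

    def __init__(self, records_by_split: Dict[str, List[dict]]):
        # Records are injected here; a real wrapper would load them from disk or a hub.
        self._records = records_by_split

    def load_split(self, split: str) -> List[dict]:
        if split not in self._records:
            raise KeyError(f"unknown split: {split!r}")
        return self._records[split]

    def format_example(self, record: dict) -> Example:
        # List the options in letter order, then ask for the answer letter.
        options = "\n".join(
            f"{letter}. {text}" for letter, text in sorted(record["options"].items())
        )
        prompt = f"{record['question']}\n{options}\nAnswer:"
        return Example(prompt=prompt, target=record["answer"])
```

The other datasets in the list would follow the same shape, with only `load_split` and `format_example` changing per dataset.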

Each dataset that is added should at least be smoke-tested to confirm that both evaluation and training run.
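A smoke test here could be as small as the sketch below: take a handful of examples, run one training step and one evaluation pass, and check that nothing crashes and the numbers are finite. The `train_step` and `evaluate` callables are placeholders for whatever the repo's training and evaluation entry points turn out to be. Their signatures, and the `fake_*` functions used to exercise the check, are assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # (prompt, target)


def smoke_test(
    train_examples: Sequence[Pair],
    eval_examples: Sequence[Pair],
    train_step: Callable[[List[Pair]], float],
    evaluate: Callable[[List[Pair]], float],
    n: int = 4,
) -> Dict[str, float]:
    """Run one training step and one eval pass on at most n examples each."""
    train_batch = list(train_examples[:n])
    eval_batch = list(eval_examples[:n])
    if not train_batch or not eval_batch:
        raise ValueError("smoke test needs at least one train and one eval example")

    # Catch formatting bugs early: no empty prompts or targets.
    for prompt, target in train_batch + eval_batch:
        if not prompt.strip() or not target.strip():
            raise ValueError("found an empty prompt or target")

    loss = float(train_step(train_batch))
    score = float(evaluate(eval_batch))
    if not math.isfinite(loss) or not math.isfinite(score):
        raise ValueError(f"non-finite result: loss={loss}, score={score}")
    return {"loss": loss, "score": score}


# Hypothetical stand-ins so the check itself can be run without a model.
def fake_train_step(batch: List[Pair]) -> float:
    return float(len(batch))


def fake_evaluate(batch: List[Pair]) -> float:
    return 1.0
```

Swapping the `fake_*` functions for the real training and evaluation calls gives a per-dataset check that is quick enough to run before each baseline.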