hxhcreate closed this issue 7 months ago
Hi there, could you please provide more information, such as whether your question is about Llama Guard or CyberSecEval, and what exact script you are looking for? Thanks.
Which script are you looking for?
Hi, if you're looking for Llama Guard evaluation, Llama recipes has a script for running inference. We then compute the metrics with sklearn's precision_score, recall_score, f1_score, and average_precision_score. Is this what you're looking for?
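For anyone trying to reproduce this, here is a rough sketch of how the metric step could look with those four sklearn functions. It is not the official evaluation code: the helper name `compute_metrics`, the label convention (1 = unsafe, 0 = safe), and the toy inputs are all assumptions made for illustration. In practice `y_pred` and `y_score` would come from the inference script's outputs, for example the predicted label and the model's probability of "unsafe".

```python
from sklearn.metrics import (
    average_precision_score,
    f1_score,
    precision_score,
    recall_score,
)


def compute_metrics(y_true, y_pred, y_score):
    """Hypothetical helper: binary classification metrics for a safety classifier.

    y_true  : ground-truth labels (assumed 1 = unsafe, 0 = safe)
    y_pred  : hard predictions from the model (same convention)
    y_score : model's score/probability for the positive ("unsafe") class
    """
    return {
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        # Average precision is computed from scores, not hard labels.
        "auprc": average_precision_score(y_true, y_score),
    }


# Toy data, made up for illustration only (not real evaluation results).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]

metrics = compute_metrics(y_true, y_pred, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that average_precision_score takes continuous scores while the other three take hard labels, so the inference step needs to expose both.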
I will close this issue, but please reopen if you have further questions. Thanks!
I was following this work. It would be greatly appreciated if you could release the evaluation code to help us reproduce your results!