patronus-ai / copyright-evals

https://www.patronus.ai/blog/introducing-copyright-catcher
Creative Commons Zero v1.0 Universal

Evaluating Copyright Violations in LLMs

This repository contains the test set, scripts, and results for our analysis of how open-source and closed-source models reproduce copyrighted works. For more details, please read our blog post.

Contact: If you have questions about this work, you can email us at contact@patronus.ai


Dataset Overview: The open-source dataset (n=100) has the following attributes for each entry:

    - Book Name: name of the book
    - Author: corresponding author
    - Prompt: constructed adversarial prompt

The dataset consists of 50 first-passage prompts and 50 completion prompts. The model generations and labels are available in the results folder.
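
As a rough illustration of the record layout described above, the sketch below reads rows with the three documented attributes into simple Python objects. It assumes the dataset is stored as a CSV file whose column headers match the attribute names exactly (`Book Name`, `Author`, `Prompt`); the file format, the `PromptRecord` class, the `load_records` helper, and the sample rows are all hypothetical and are not taken from this repository.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class PromptRecord:
    """One dataset entry (hypothetical container, not part of the repo)."""
    book_name: str
    author: str
    prompt: str


def load_records(handle):
    """Read rows with the three documented columns into PromptRecord objects.

    Assumes CSV input with headers 'Book Name', 'Author', and 'Prompt'.
    """
    reader = csv.DictReader(handle)
    return [
        PromptRecord(
            book_name=row["Book Name"].strip(),
            author=row["Author"].strip(),
            prompt=row["Prompt"].strip(),
        )
        for row in reader
    ]


# Placeholder rows for illustration only; these are not taken from the dataset.
SAMPLE_CSV = """Book Name,Author,Prompt
Example Book A,Example Author A,<adversarial prompt text 1>
Example Book B,Example Author B,<adversarial prompt text 2>
"""

records = load_records(io.StringIO(SAMPLE_CSV))
for record in records:
    print(f"{record.book_name} by {record.author}: {record.prompt}")
```

To read an actual file instead of the in-memory sample, pass an open file handle, for example `load_records(open(path, newline=""))`, where `path` points to wherever the test set lives in your checkout.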