Items for new categories

Red-teaming:

ACL 2024 Tutorial: Vulnerabilities of Large Language Models to Adversarial Attacks
CDAO frameworks, guidance, and best practices for AI test & evaluation
ChatGPT_system_prompt
DAIR prompting guide, risks and misuses
Extracting Training Data from ChatGPT
Frontier Model Forum: What is Red Teaming?
Identifying and Eliminating CSAM in Generative ML Training Data and Models
Microsoft: Microsoft AI Red Team building future of safer AI
OpenAI Red Teaming Network
Red Teaming of Advanced Information Assurance Concepts
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

Cheatsheets and infographics:

jphall663/awesome-machine-learning-interpretability