jphall663/awesome-machine-learning-interpretability
A curated list of awesome responsible machine learning resources.
Creative Commons Zero v1.0 Universal
3.53k stars, 579 forks
Items for new categories #283
Closed
jphall663 closed 4 months ago
jphall663 commented 5 months ago
Red-teaming:
ACL 2024 Tutorial: Vulnerabilities of Large Language Models to Adversarial Attacks
CDAO frameworks, guidance, and best practices for AI test & evaluation
ChatGPT_system_prompt
DAIR prompting guide, risks and misuses
Extracting Training Data from ChatGPT
Frontier Model Forum: What is Red Teaming?
Identifying and Eliminating CSAM in Generative ML Training Data and Models
Microsoft: Microsoft AI Red Team building future of safer AI
OpenAI Red Teaming Network
Red Teaming of Advanced Information Assurance Concepts
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Cheatsheets and infographics:
Machine Learning Attack_Cheat_Sheet
Different types of AI:
https://fpf.org/wp-content/uploads/2021/01/FPF_AIEcosystem_illo_03.pdf
(new)
Red-teaming:
Cheatsheets and infographics: