meg-huggingface opened 5 months ago
It looks like you've updated documentation related to model or dataset cards in this PR.
Some content is duplicated among several files in the huggingface_hub repo. Please make sure that everything stays consistent.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks @julien-c, completely agree it's a bit heavy, and also agree we should keep model cards as light and as generally applicable as possible, appreciate that. I have been waffling on the best approach with respect to 2 things happening at the moment:
To get to your point about applicability to different models: although interest in this has grown for generative models, these kinds of issues apply to many other kinds of models as well:
So, in light of all of the above: what about creating an optional, free-text section in the Testing/Evaluation part called "Societal Impact" or "Safety"? Then, in the Annotated Model Card, we could explain some of the issues that people might check for (privacy, violence, sexualization), but explicitly not have tags for them, to keep the card less "heavy" and to circumvent the foreseeable issue where malicious actors search specifically for models that are harmful. The distinction from "Bias, Risks, and Limitations" is that people tend to put very general things there ("may produce incorrect information", "may be biased wrt skin tone", etc.), whereas the "Societal Impact"/"Safety" section would speak directly to what people are now testing for.
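As a rough illustration (the section name and wording here are hypothetical, not a settled template), the proposed optional free-text section might sit under Testing like this, with a trivial check showing that its presence is optional rather than enforced by tags:

```python
# Hypothetical sketch of a model card fragment with the proposed
# optional, free-text "Safety" section under Testing/Evaluation.
# Section names are illustrative only, not the real card template.

CARD_WITH_SAFETY = """\
## Testing

### Metrics

Accuracy on the held-out test set.

### Safety

Free-text notes on safety evaluations the authors ran, e.g. checks
for privacy leakage, violent content, or sexualization. No
machine-readable tags, so harmful models cannot be searched for by
these labels.
"""

def has_safety_section(card_text: str) -> bool:
    """Return True if the optional '### Safety' heading is present."""
    return any(line.strip() == "### Safety"
               for line in card_text.splitlines())

print(has_safety_section(CARD_WITH_SAFETY))            # True
print(has_safety_section("## Testing\n\n### Metrics"))  # False
```

Because the section is plain prose with no structured tags, a card without it is just as valid as one with it, which keeps the template light.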
Also ping @luciekaffee in case this convo interests her.
yep, sounds like a good trade off to me!
"Societal Impact" or "Safety"

How about "Safety Assessment"? So the goal is to encourage the reporting of specific evaluations the model authors may have undertaken? "Societal Impact" might be more accurate, but for a casual reader (me) the distinction with respect to "Bias, Risks, and Limitations" might be a bit blurred. "Safety" may be more generic if we expect authors to also include information about guardrails or similar mechanisms.
That makes a lot of sense to me!
Food for thought: adding a section that deals directly with redteaming (as that is becoming a big interest/concern). I called it "Safety Testing" so it's not limited to redteaming and can also include adversarial testing (a distinction some folks in the redteaming world make). It hits specifically on CSAM, as we have been inspired by Thorn and Rebecca Portnoff to be more proactive about this. For more context, see: https://huggingface.slack.com/archives/C0317KZSB5H/p1712707749145689?thread_ts=1712614250.080999&cid=C0317KZSB5H