Closed xiafeng-nb closed 2 weeks ago
Hello! You should be good to follow the label in the prompt, as that is the one the model will use. In NeMo Curator, that has "Sexual" as O2 like you mentioned.
Additionally, the model was trained with the label assignments shuffled for each sample. So, in theory, you could change the prompt to assign any category to O2 and the model should be robust to it. Please reopen the issue if you have any more questions.
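To illustrate the point about following the prompt, here is a minimal sketch of keeping the code-to-category mapping tied to whatever order the prompt uses. The category list, the helper names `build_taxonomy` and `decode`, and the `unsafe` plus comma-separated codes output format are all assumptions made for illustration; this is not NeMo Curator's actual implementation.

```python
# Hypothetical subset of category names, for illustration only.
CATEGORIES = ["Violence", "Sexual", "Criminal Planning"]


def build_taxonomy(categories):
    """Assign O1, O2, ... in the order given.

    Returns the taxonomy text to place in the prompt and the
    code -> category-name map that matches that exact text.
    """
    code_to_name = {f"O{i}": name for i, name in enumerate(categories, start=1)}
    text = "\n".join(f"{code}: {name}." for code, name in code_to_name.items())
    return text, code_to_name


def decode(model_output, code_to_name):
    """Map the codes in a model response back to category names.

    Assumes (hypothetically) a first line of 'safe' or 'unsafe',
    followed by comma-separated codes such as 'O2'.
    """
    lines = [ln.strip() for ln in model_output.strip().splitlines() if ln.strip()]
    if not lines or lines[0].lower() == "safe":
        return []
    codes = [c.strip() for ln in lines[1:] for c in ln.split(",")]
    # Unknown codes are skipped rather than guessed.
    return [code_to_name[c] for c in codes if c in code_to_name]


taxonomy_text, mapping = build_taxonomy(CATEGORIES)
print(taxonomy_text)
print(decode("unsafe\nO2", mapping))
```

Because the mapping is built from the same list that produces the prompt text, reordering the categories changes both together, so a code is always decoded against the prompt that was actually sent.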
Regarding the AEGIS model labels: on the Model Card at https://huggingface.co/nvidia/Aegis-AI-Content-Safety-LlamaGuard-Defensive-1.0, 'Sexual' is labeled as 'O9', but in the code and in the prompts in the paper, 'Sexual' is labeled as 'O2'. Which one should I follow?