msakarvadia/llm_bias
Investigating if we can find circuits in LLMs that reinforce human-biases found in training data
MIT License
Training Masks #8
Open
msakarvadia opened 4 months ago
msakarvadia commented 4 months ago
[x] Initialize masks with a phrase (e.g. "I am a woman").
[x] Train the masks to move away from a specific demographic group.
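For readers new to the thread, the two items above can be illustrated with a toy sketch. This is not the code in this repo: `embed`, `PROBE`, `GROUPS`, `group_probs`, and `train_mask_away` are hypothetical names, the embedding table is a seeded random stand-in, and a frozen linear probe stands in for the LLM's demographic prediction. It only shows the shape of the idea: start the mask at the embedding of a phrase, then take gradient steps that lower the log-probability of one target group.

```python
import math
import random

DIM = 8  # toy embedding size (assumption)
GROUPS = ["woman", "man", "other"]  # hypothetical demographic labels


def embed(word):
    """Hypothetical stand-in for an embedding table: one fixed vector per word."""
    rng = random.Random(word)  # string seed keeps the vector reproducible
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


# Frozen toy probe standing in for the model's demographic prediction.
PROBE = [embed("probe:" + g) for g in GROUPS]


def group_probs(mask):
    """Softmax over groups, computed from the mean of the mask vectors."""
    k = len(mask)
    mean = [sum(vec[i] for vec in mask) / k for i in range(DIM)]
    logits = [sum(w * x for w, x in zip(row, mean)) for row in PROBE]
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_mask_away(mask, target, lr=0.1, steps=50):
    """Gradient descent on log p(target), lowering the target group's probability."""
    t = GROUPS.index(target)
    mask = [vec[:] for vec in mask]  # copy so the initial mask stays intact
    k = len(mask)
    for _ in range(steps):
        probs = group_probs(mask)
        # d log p_t / d mean = W_t - sum_g p_g * W_g; each mask vector gets 1/k of it.
        grad = [
            (PROBE[t][i] - sum(p * row[i] for p, row in zip(probs, PROBE))) / k
            for i in range(DIM)
        ]
        for vec in mask:
            for i in range(DIM):
                vec[i] -= lr * grad[i]
    return mask


# Initialize the mask with a phrase, then train it away from one group.
init_mask = [embed(w) for w in "I am a woman".split()]
trained_mask = train_mask_away(init_mask, target="woman")
p_before = group_probs(init_mask)[GROUPS.index("woman")]
p_after = group_probs(trained_mask)[GROUPS.index("woman")]
```

Here `p_before` is the probe's output for the initialized, untrained mask and `p_after` is the same quantity after training. In a real setup the probe would be replaced by the frozen LLM and the gradient would come from an autodiff framework.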
msakarvadia
commented
4 months ago
[ ] Ensure that the learning rate (LR) is lower.
[ ] Ensure that the mask is appended to the comment rather than to the entire prompt.
[ ] Check the non-optimized model's performance with the initialized mask.
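To make the mask-placement item concrete, here is a minimal sketch of the two layouts. It assumes the prompt is built from three token lists (an instruction, the user comment, and a trailing question); that split, and the names `mask_after_comment` and `mask_after_prompt`, are assumptions for illustration rather than the repo's actual prompt format.

```python
def mask_after_comment(instruction, comment, mask, question):
    """Intended layout: the mask sits directly after the comment."""
    return instruction + comment + mask + question


def mask_after_prompt(instruction, comment, mask, question):
    """Layout to avoid: the mask trails the entire prompt."""
    return instruction + comment + question + mask


# Hypothetical token lists; a real pipeline would use tokenizer ids or embeddings.
instruction = ["Guess", "the", "author's", "gender."]
comment = ["Just", "got", "back", "from", "the", "gym."]
mask = ["<m0>", "<m1>", "<m2>"]
question = ["Answer:"]

wanted = mask_after_comment(instruction, comment, mask, question)
unwanted = mask_after_prompt(instruction, comment, mask, question)
mask_start = wanted.index("<m0>")
```

In `wanted` the mask starts at the position immediately following the last comment token, whereas in `unwanted` it comes after the question.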