Closed huangkaipeng4399 closed 2 months ago
Thank you for your question! The masking-based implementation is just one approach to calculating gradients solely based on the output “Sure” within the current framework. Other implementations may also work using the same logic. Thanks!
Got it. Thanks again!
Hello! Sorry to bother, but I have a problem to consult you about the code.
When reading your code, I found that in code/find_critical_parameters.py, when you specifying the label of the input text, you use -100 to mask the preceding tokens of "sure". I want to ask why not just specify the label as "sure"? What's the pro of the way your code takes?
Thanks!!