Closed TheaperDeng closed 2 weeks ago
LGTM as well. I tried the gradients on both linear and convolution layers.
But do we also want to add merging of parameters within the same layer? For example, if one wants to conduct attribution on the first linear layer, it would be handy to combine `linear1.weight` and `linear1.bias` and obtain the concatenated gradients directly, rather than getting the separate gradients first and concatenating them later in the attribution calculation.
I thought about this as well. Since we are using `named_parameters().keys()` as the valid choices for layer names, I am afraid that a layer's parameters are sometimes not named `xxx.weight` and `xxx.bias`, which may cause errors.
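One way to sidestep the naming concern is to select parameters by layer-name prefix rather than assuming `.weight`/`.bias` suffixes. A minimal sketch (hypothetical helper, not dattri's actual API; `layer_grad` is a name invented here for illustration):

```python
# Hypothetical sketch: concatenate the flattened gradients of every
# parameter belonging to one layer, selecting parameters by layer-name
# prefix instead of assuming ".weight"/".bias" names.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

x = torch.randn(8, 4)
loss = model(x).sum()
loss.backward()

def layer_grad(model, layer_name):
    """Flatten and concatenate grads of all params under `layer_name`."""
    grads = [
        p.grad.flatten()
        for name, p in model.named_parameters()
        if name == layer_name or name.startswith(layer_name + ".")
    ]
    if not grads:
        raise KeyError(f"no parameters found under layer {layer_name!r}")
    return torch.cat(grads)

# First Linear layer ("0" in nn.Sequential): weight (3*4) + bias (3) -> 15
g = layer_grad(model, "0")
print(g.shape)  # torch.Size([15])
```

This works for any parameter names (e.g. `weight_ih_l0` in an RNN), since it only relies on the `module_name.param_name` prefix convention of `named_parameters()`.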
Description
1. Motivation and Context
2. Summary of the change
Support `layer_name` for `task.get_grad_target_func`, `task.get_target_func`, `task.get_grad_loss_func`, and `task.get_loss_func` (previously these raised a `NotImplementedError`). TODO: this is the first PR to support partial parameters in dattri; subsequent PRs will support this feature in the high-level attributor APIs.
3. What tests have been added/updated for the change?