This PR adds two new optional parameters to every explainer: `n_steps` and `internal_batch_size`.

Both parameters are passed directly into Captum's LIGAttribution instance. `n_steps` controls how many steps are used in the integration between the baseline inputs and the actual inputs. In theory, more steps produce more stable explanations, but increasing `n_steps` also increases the time it takes to calculate attributions. The default value in Captum is 50.

`internal_batch_size` controls the batch size used internally when computing attributions. This is particularly useful in cases where transformers-interpret was going OOM because of the amount of gradient accumulation in a single step (#51). Lowering `internal_batch_size` should greatly reduce memory overhead in situations where a single batch of gradients would otherwise cause OOM.
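For illustration, here is a minimal sketch of how the new parameters could be passed through an explainer call. The `SequenceClassificationExplainer` usage and the model name are assumptions made for the example; the `n_steps` and `internal_batch_size` keyword arguments are the ones added in this PR.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer

# Example model: any sequence classification checkpoint should work here.
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

cls_explainer = SequenceClassificationExplainer(model, tokenizer)

# More integration steps -> potentially more stable attributions (but slower).
# A small internal batch size keeps per-step gradient memory low, which helps
# avoid the OOM behaviour described in #51.
word_attributions = cls_explainer(
    "This film was a complete waste of time.",
    n_steps=100,
    internal_batch_size=2,
)
```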
Motivation and Context
References issue: #51
Tests and Coverage
Types of changes
[X] Bug fix (non-breaking change which fixes an issue)
[X] New feature (non-breaking change which adds functionality)
[ ] Docs (Added to or improved Transformers Interpret's documentation)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
Final Checklist:
[X] My code follows the code style of this project.