Open nathankim7 opened 9 months ago
Is this PR done? I would really like to use this functionality for my ongoing project.

Also, do interventions at a given time-step carry over to all future time-steps? For example, if the token at t=5 receives an intervention, does it persist for all generation steps t>5? I think the answer is yes, but do you think it would be possible to make it otherwise? In other words, I want to schedule interventions that are specific to particular time-steps and apply only at that generation step. For example, I would want to intervene on the last token at every generation step, i.e. only intervene on the token at t=5 when generating t=6, not when generating t=7, t=8, and so on.

Thoughts would be appreciated.
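A minimal sketch of the kind of step-specific scheduling being asked about, written against the `timestep_selector` callback described in this PR; the exact `generate()` keyword and callback arguments here are assumptions based on the description below:

```python
import torch

def only_at_position_5(position: int, hidden: torch.Tensor) -> bool:
    # Fire only when the current position is t=5, i.e. while the model is
    # producing the token at t=6; every other step is left untouched.
    return position == 5

# Hypothetical call shape, assuming a single configured intervention
# (timestep_selector is a list of length num_intv):
# outputs = intervenable.generate(
#     prompt_inputs,
#     unit_locations=...,
#     timestep_selector=[only_at_position_5],
#     max_new_tokens=10,
# )
```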
Description
Basic functionality for scheduling interventions on positions not present in the prompt (i.e. generated tokens). Ideally, GRU models should follow the same procedure.
Changelog:
- `timestep_selector`, a list of length `num_intv` of boolean callbacks with signature `Callable[[int, torch.Tensor], bool]`, can be passed to `generate()` calls. Each intervention calls its callback with the current position to determine whether the intervention should operate on that position or not (see the sketch after this list).
- `None` values in unit locations: if `None` is specified at the batch dimension, interventions are not applied to those examples in the batch.
- `_intervention_getter()` and `_intervention_setter()` were being called with single interventions even though they were written to handle an array of intervention keys and return a list of handlers; this has been removed.
- `gather_neurons()` and `scatter_neurons()`
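A hedged usage sketch of the `timestep_selector` entry above: one boolean callback per intervention, each deciding whether its intervention applies at the current position. The model setup, the prompt length, and the exact `generate()` keyword names are assumptions for illustration, not confirmed API:

```python
import torch

prompt_len = 8  # assumed prompt length, for illustration only

def generated_positions_only(position: int, hidden: torch.Tensor) -> bool:
    # Apply this intervention only to positions not present in the prompt.
    return position >= prompt_len

def every_other_generated_step(position: int, hidden: torch.Tensor) -> bool:
    # Apply this intervention on every second generated position.
    return position >= prompt_len and (position - prompt_len) % 2 == 0

# One callback per intervention, so len(selectors) == num_intv.
selectors = [generated_positions_only, every_other_generated_step]

# Hypothetical call shape, assuming two interventions are configured:
# outputs = intervenable.generate(
#     base_inputs,
#     unit_locations=...,
#     timestep_selector=selectors,
#     max_new_tokens=20,
# )
```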
Testing Done
- `test_nulling_intervention`, `test_generation_with_source_intervened_prompt`, `test_dynamic_static_generation_intervention_parity`, `test_generation_noops`
- `test_with_subspace_negative`, `test_scatter_neurons_gpt2_attn_with_head_positive`