Feature/reagent branch for ReAGent

casszhao commented 9 months ago

ReAGent, for a Model-agnostic Feature Attribution Method for Generative Language Models

Paper link: https://arxiv.org/abs/2402.00794

Type of Change

🚀 New feature (non-breaking change which adds functionality)

gsarti commented 9 months ago

Hey there, thanks a lot for the PR @casszhao @xuan25 !

I had a look at the code structure you pushed and I have some comments on the current implementation:

It's great that you separated all components in separate modules, but to keep it consistent with the rest of the library I'd like to have one file per module type containing all relevant classes (e.g. a file stopping_condition.py with the base class, DummyStoppingConditionEvaluator and TopKStoppingConditionEvaluator) to avoid excessive nesting. You can see an example in the token_sampler.py file I created in the reagent_core folder.
I changed the id to lowercase (reagent) and the class name to Reagent to enforce the style we're using throughout the library.
Currently there is a single class used per component type (e.g. POSTagTokenSampler for samplers), but all the other classes are also included. If the method is meant to be used with a specific class, then we do not need the other ones. If you believe it would be worth keeping the user options open, there should be a parameter allowing the user to pick a modality.
Having a look at sampler classes, I realized that the current implementation probably does not support source and target-side context (as in encoder-decoder models), given there is a single set of input ids that is passed on. For a release in Inseq we do want both encoder-decoder and decoder-only support.
I was puzzled about the InferentialMTokenSamplerclass since in its current state (as shown in the __main__ function) loads an encoder model with an untrained language modeling head and uses it for sampling. I imagine this is not the desired behavior, but I'm not sure if you'd want an AutoModelForMaskedLM with the encoder or an AutoModelForCausalLM with a regular decoder-only model in its place.
Not sure we need neither serializing.py nor the Traceable class, given that the outputs will already be provided in the format supported by Inseq.

I will keep having a look in the next week, if you could address some of the issues I describe up here it would be of great help!

gsarti commented 9 months ago

I've had a look at the changes and we're much closer to get it merged @xuan25, thanks! 🤗

Will have a second look to fix minor details. Just a quick Q:

I've seen that currently you raise an exception when an encoder-decoder with attribute_target is used. In principle that would be fine, but would there be a way to support ReAGent perturbations on target-side prefixes? Intuitively, this shouldn't be too problematic to adapt, right? I think the only potential pain point would be to sample tokens from the correct vocabulary, since Enc-Dec models may have two (see e.g. https://huggingface.co/Helsinki-NLP/opus-mt-en-zh/tree/main)

xuan25 commented 9 months ago

I've had a look at the changes and we're much closer to get it merged @xuan25, thanks! 🤗

Will have a second look to fix minor details. Just a quick Q:

* I've seen that currently you raise an exception when an encoder-decoder with `attribute_target` is used. In principle that would be fine, but would there be a way to support ReAGent perturbations on target-side prefixes? Intuitively, this shouldn't be too problematic to adapt, right? I think the only potential pain point would be to sample tokens from the correct vocabulary, since Enc-Dec models may have two (see e.g. https://huggingface.co/Helsinki-NLP/opus-mt-en-zh/tree/main)

Thanks for the feedback. Yeah, I have implemented the attribute target in the latest commits, but without limiting the sampled token within the same language (vocabulary set). However, it should do the job of at least perturbating the inputs.

gsarti commented 7 months ago

Hi @xuan25 @casszhao, sorry for the delay! I'm having a look at this and ran with no issues using decoder-only and encoder-decoders (both with and without target attribution), so I think we are quite close to merging now. I pushed some fixes including:

setting NLTK as an optional dependency for inseq, matching the logic in the token sampler
slight variable renaming to make some logics more evident for the user (top_n and top_n_ratio are now called keep_top_n and keep_ratio to make it clear that tokens are kept)

xuan25 commented 7 months ago

Hi @xuan25 @casszhao, sorry for the delay! I'm having a look at this and ran with no issues using decoder-only and encoder-decoders (both with and without target attribution), so I think we are quite close to merging now. I pushed some fixes including:

setting NLTK as an optional dependency for inseq, matching the logic in the token sampler

slight variable renaming to make some logics more evident for the user (top_n and top_n_ratio are now called keep_top_n and keep_ratio to make it clear that tokens are kept)

Thanks, @gsarti !

gsarti commented 7 months ago

We are ready for merging! 🎉 Just noting down here some points in the current that can be improved regarding the current ReAGent implementation:

overlap_strict_pos currently defaults to True, and the False condition is in TODO. If it's added, the purpose of this check needs to be made more explicit in docstrings.
The AggregateRationalizer class currently supports only a batch size of 1 because it builds a batch of various masked examples using num_probes. Ideally, we'd want batching to still be allowed here, taking inspiration from the Captum Integrated Gradients implementation where they face the same issue (theinternal_batch_size there is equivalent to num_probes, and it is used to build the interpolation steps across all batch elements.
Currently the ReAGent implementation doesn't make use of attributed_fn to specify what step function to use to estimate token importance, and always uses the logit. It would be good to use the attribution_model forward instead of extracting the underlying AutoModel, since it would automatically handle this and allow for out-of-the-box usage for e.g. contrastive feature attribution, or other user-specified step functions.
Most attribution methods use attribution_args for parametrizing the attribution run at inference time, unless there is some expensive overhead that can be spared on repeated calls, in which case the parameters should be passed to __init__. In ReAGent, all parameters are passed upon __init__. My suggestion would be to keep the POSTaggingSampler instantiation upon init, since it might need to build the POS tags map, but move the rest as argument of model.attribute.

Provided the current implementation is functional for both decoder-only and encoder-decoder models, I will proceed with the merge and any further development regarding these issues should be performed in a dedicated PR. Thanks again @casszhao @xuan25 for your contribution! 😄

casszhao commented 7 months ago

Hi Thanks, will promote it later on X. Cheers ~

Best Regards

Cass Z linkedin.com/in/casszhao M: 44 7516 862694

On Sat, Apr 13, 2024 at 11:00 Gabriele Sarti @.***> wrote:

We are ready for merging! 🎉 Just noting down here three points in the current that can be improved regarding the current ReAGent implementation:

overlap_strict_pos currently defaults to True, and the False condition is in TODO. If it's added, the purpose of this check needs to be made more explicit in docstrings.

The AggregateRationalizer class currently supports only a batch size of 1 because it builds a batch of various masked examples using num_probes. Ideally, we'd want batching to still be allowed here, taking inspiration from the Captum Integrated Gradients implementation https://captum.ai/api/_modules/captum/attr/_core/integrated_gradients.html#IntegratedGradients where they face the same issue (theinternal_batch_size there is equivalent to num_probes, and it is used to build the interpolation steps across all batch elements.

Currently the ReAGent implementation doesn't make use of attributed_fn to specify what step function to use to estimate token importance, and always uses the logit. It would be good to use the attribution_model forward instead of extracting the underlying AutoModel, since it would automatically handle this and allow for out-of-the-box usage for e.g. contrastive feature attribution, or other user-specified step functions.

Provided the current implementation is functional for both decoder-only and encoder-decoder models, I will proceed with the merge and any further development regarding these issues should be performed in a dedicated PR. Thanks again @casszhao https://github.com/casszhao @xuan25 https://github.com/xuan25 for your contribution! 😄

— Reply to this email directly, view it on GitHub https://github.com/inseq-team/inseq/pull/250#issuecomment-2053596768, or unsubscribe https://github.com/notifications/unsubscribe-auth/AIFSOHH6EJCBWLHSIOIARWDY5D64RAVCNFSM6AAAAABDPLAO26VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANJTGU4TMNZWHA . You are receiving this because you were mentioned.Message ID: @.***>

inseq-team / inseq

Feature/reagent branch for ReAGent #250

Type of Change