huggingface / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
https://huggingface.co/docs/transformers.js
Apache License 2.0

How to access attention matrices for MarianMT? #516


DaveTJones commented 10 months ago

Question

Hey, I've been trying to access the attention matrices output by MarianMT like so (please excuse the unorthodox config argument; tidying it up is next on my to-do list):

  import { MarianTokenizer, MarianMTModel } from '@xenova/transformers';

  const text = 'Hello, world!'; // example input

  const model_name = "Xenova/opus-mt-en-fr";
  const tokenizer = await MarianTokenizer.from_pretrained(model_name, {
    config: {
      output_hidden_states: true,
      output_attentions: true
    }
  });
  const tokens = (await tokenizer(text)).input_ids;
  // Load the model with attention outputs enabled in its config
  const model = await MarianMTModel.from_pretrained(model_name, {
    config: {
      model_type: 'marian',
      is_encoder_decoder: true,
      _name_or_path: 'Helsinki-NLP/opus-mt-en-fr',
      _num_labels: 3,
      activation_dropout: 0,
      activation_function: 'swish',
      add_bias_logits: false,
      add_final_layer_norm: false,
      architectures: ['MarianMTModel'],
      attention_dropout: 0,
      bad_words_ids: [[Array]], // nested array elided by console.log
      bos_token_id: 0,
      classif_dropout: 0,
      classifier_dropout: 0,
      d_model: 512,
      decoder_attention_heads: 8,
      decoder_ffn_dim: 2048,
      decoder_layerdrop: 0,
      decoder_layers: 6,
      decoder_start_token_id: 59513,
      decoder_vocab_size: 59514,
      dropout: 0.1,
      encoder_attention_heads: 8,
      encoder_ffn_dim: 2048,
      encoder_layerdrop: 0,
      encoder_layers: 6,
      eos_token_id: 0,
      forced_eos_token_id: 0,
      gradient_checkpointing: false,
      id2label: { '0': 'LABEL_0', '1': 'LABEL_1', '2': 'LABEL_2' },
      init_std: 0.02,
      label2id: { LABEL_0: 0, LABEL_1: 1, LABEL_2: 2 },
      max_length: 512,
      max_position_embeddings: 512,
      normalize_before: false,
      normalize_embedding: false,
      num_beams: 4,
      num_hidden_layers: 6,
      pad_token_id: 59513,
      scale_embedding: true,
      share_encoder_decoder_embeddings: true,
      static_position_embeddings: true,
      transformers_version: '4.34.0.dev0',
      use_cache: true,
      vocab_size: 59514,
      output_hidden_states: true,
      output_cross_attentions: true,
      output_attentions: true
    }
  });

  // Translate, decode the result, then try to read the attention weights
  const translated = await model.generate(tokens);
  const result = tokenizer.decode(translated[0], { skip_special_tokens: true });
  console.log(await model.getAttentions(translated));

I'm then getting the following error when I run the code:

Error: output_attentions is true, but the model did not produce cross-attentions. This is most likely because the model was not exported with output_attentions=True.

I've looked around but haven't been able to work out what "exporting" the model refers to here. How would I go about fixing this?

xenova commented 10 months ago

Hi there 👋 The attention matrices have to be exposed as outputs of the ONNX graph when the model is converted; setting output_attentions at runtime cannot add outputs to a model that was exported without them. You can create a custom OnnxConfig in optimum to enable the attention matrices. See here for more information: https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#custom-export-of-transformers-models
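
Here's a minimal sketch of what that custom export could look like for opus-mt-en-fr, adapted from the Whisper example in the guide above. Note that the MarianOnnxConfig import, the "text2text-generation" task name, and the dynamic-axis names are assumptions to verify against your installed optimum version:

  from typing import Dict

  from transformers import AutoConfig
  from optimum.exporters.onnx import main_export
  from optimum.exporters.onnx.base import ConfigBehavior
  from optimum.exporters.onnx.model_configs import MarianOnnxConfig

  class MarianAttentionOnnxConfig(MarianOnnxConfig):
      # Expose the per-layer attention tensors as extra graph outputs.
      @property
      def outputs(self) -> Dict[str, Dict[int, str]]:
          common_outputs = super().outputs
          if self._behavior is ConfigBehavior.ENCODER:
              for i in range(self._config.encoder_layers):
                  # Shape: (batch, heads, source_len, source_len)
                  common_outputs[f"encoder_attentions.{i}"] = {
                      0: "batch_size",
                      2: "encoder_sequence_length",
                      3: "encoder_sequence_length",
                  }
          elif self._behavior is ConfigBehavior.DECODER:
              for i in range(self._config.decoder_layers):
                  # Decoder self-attention over the generated tokens
                  common_outputs[f"decoder_attentions.{i}"] = {
                      0: "batch_size",
                      2: "decoder_sequence_length",
                      3: "past_decoder_sequence_length + 1",
                  }
                  # Cross-attention from decoder positions to source tokens
                  common_outputs[f"cross_attentions.{i}"] = {
                      0: "batch_size",
                      2: "decoder_sequence_length",
                      3: "encoder_sequence_length_out",
                  }
          return common_outputs

  model_id = "Helsinki-NLP/opus-mt-en-fr"
  config = AutoConfig.from_pretrained(model_id)
  onnx_config = MarianAttentionOnnxConfig(config=config, task="text2text-generation")

  main_export(
      model_id,
      output="opus-mt-en-fr-with-attentions",
      no_post_process=True,
      # Trace the PyTorch model with output_attentions=True so the
      # attention tensors actually exist to be captured as outputs.
      model_kwargs={"output_attentions": True},
      custom_onnx_configs={
          "encoder_model": onnx_config.with_behavior("encoder"),
          "decoder_model": onnx_config.with_behavior("decoder", use_past=False),
          "decoder_with_past_model": onnx_config.with_behavior("decoder", use_past=True),
      },
  )

Once the export succeeds, point from_pretrained at the exported files (e.g. a local model folder with the .onnx files in an onnx/ subdirectory) and getAttentions should find the tensors it needs.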

DaveTJones commented 10 months ago

Brilliant, thank you!