danchern97 / tda4atd

This is the official repository for "Artificial Text Detection via Examining the Topology of Attention Maps", presented at the EMNLP 2021 conference.

Question on the number of heads used in the analysis #2

Open sehunfromdaegu opened 2 months ago

sehunfromdaegu commented 2 months ago

Thank you for sharing the code. I'm confused about something, and I would appreciate it if you could confirm whether my understanding is correct.

  1. Are you using the outputs of all attention heads for the analysis? The paper you cited, 'Roles and Utilization of Attention Heads in Transformer-based Neural Language Models', appears to use only selected heads, but your code seems to use all heads. Is this correct?

  2. After extracting all the features, are they concatenated and used as the input to a single linear binary classifier? If they are concatenated, the resulting dimensionality would be quite large, I guess (see my rough sketch below).
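
To make my understanding concrete, here is a rough sketch of how I picture the extraction, using a HuggingFace BERT model; the checkpoint name, the placeholder feature dimensionality, and the shapes are only my own illustration, not code from this repository:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder checkpoint; the repository may use a different model.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

text = "An example sentence."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
all_heads = torch.stack(outputs.attentions, dim=1)  # (batch, layers, heads, seq, seq)
num_layers, num_heads = all_heads.shape[1], all_heads.shape[2]

# My understanding: topological features are computed per attention map
# (i.e. per layer-head pair) and then concatenated into one long vector.
# "features_per_map" below is a made-up stand-in for the actual TDA features.
features_per_map = 10  # placeholder dimensionality
feature_vector_dim = num_layers * num_heads * features_per_map
print(feature_vector_dim)  # e.g. 12 * 12 * 10 = 1440 for bert-base
```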

SilverSolver commented 2 weeks ago

Hello. Sorry for the very late response.

  1. Yes, that is correct.
  2. Yes, they are concatenated. But that's fine, because we use regularization in our logistic regression, so it works even when the number of features is larger than the number of examples in the training set.
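
As a rough illustration (not code from this repository, and the sizes and regularization strength are made up), an L2-regularized logistic regression in scikit-learn can be fitted even when the feature vector is longer than the training set:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy setting: more features than training examples, similar in spirit to
# concatenating TDA features from all layer-head pairs.
n_train, n_features = 200, 1440
X_train = rng.normal(size=(n_train, n_features))
y_train = rng.integers(0, 2, size=n_train)

# C controls the strength of the (default L2) penalty;
# smaller C means stronger regularization.
clf = LogisticRegression(C=1.0, max_iter=1000)
clf.fit(X_train, y_train)
print(clf.score(X_train, y_train))
```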