dd1497 / llm-unmasking

Code for the paper Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling accepted at ACL 2024 Findings
MIT License

Thank you for the interesting work. When will you open the source code? #2

Open leejun-ha opened 3 months ago

leejun-ha commented 3 months ago

Your work is very interesting. When will your code be open to the public? I understand you use the spaCy dependency parser for RDRR, but I am also curious about the detailed procedure for calculating RDRR. Could you share it? Thank you.

dd1497 commented 3 months ago

Thank you :)) The code is now available; you can find the details of the RDRR calculation in the notebook notebooks/Dep_Parse_Dataset_Characterization.ipynb.

leejun-ha commented 3 months ago

Thank you for sharing your code!

I hope you don't mind me asking, but I'm finding this code quite complex. If it's not too much trouble, would you be so kind as to provide an explanation?

I would be immensely grateful if you could break down its main components and functionality. Your expertise would be invaluable in helping me understand it better.

Additionally, I was wondering if you could also share the model checkpoints, if possible.

Thank you so much for your time and consideration. :)

dd1497 commented 2 weeks ago

Hello, the code runs dependency parsing on each sentence in the dataset using the spaCy library. RDRR is a characterization metric for the dataset that tells us how much the annotations of named entity/event/sentiment mentions depend on the right context. This is important because decoder-only LLMs, by default, do not see the right context (they generate text token by token).
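
For illustration, here is a minimal spaCy sketch (simplified, not the notebook code itself) of how a token's dependency head can be classified as lying in the left or right context:

```python
# Minimal sketch (not the exact notebook code): parse a sentence with spaCy
# and check, for each token, whether its syntactic head lies to the left or
# to the right of the token.
import spacy

nlp = spacy.load("en_core_web_sm")  # any English pipeline with a parser works

doc = nlp("The committee awarded the prize to the young researcher.")
for token in doc:
    if token.head.i == token.i:      # the root's head is the token itself
        side = "root"
    elif token.head.i > token.i:
        side = "right"               # head lies in the right context
    else:
        side = "left"                # head lies in the left context
    print(f"{token.text:12s} -> head: {token.head.text:12s} ({side})")
```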

For RDRR, we simply count, for each sentence in the training set, how many right-side dependencies and how many left-side dependencies there are. We count only the dependencies of the annotated named entity/event/sentiment mentions in each sentence. The right-side counts are then normalized by the total number of left- and right-side dependencies to obtain RDRR. If this number is between 0.5 and 1.0, it signals that the annotated mention spans might depend more on the right context than on the left.
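
Roughly, the counting can be sketched like this (a simplified sketch, not the exact notebook code: the span format, the function name, and the choice to count each mention token's arc to its head are assumptions made for illustration):

```python
# Sketch of the RDRR computation described above (not the exact notebook code).
# Assumptions: mention spans are given as token index ranges aligned with the
# spaCy tokenization, and each mention token's arc to its head is the counted
# dependency.
import spacy

nlp = spacy.load("en_core_web_sm")

def rdrr(sentences, mention_spans):
    """sentences: raw sentence strings.
    mention_spans: per sentence, a list of (start, end) token indices
    (end exclusive) of annotated named entity/event/sentiment mentions."""
    left, right = 0, 0
    for text, spans in zip(sentences, mention_spans):
        doc = nlp(text)
        mention_tokens = {i for start, end in spans for i in range(start, end)}
        for token in doc:
            # skip tokens outside annotated mentions and the sentence root
            if token.i not in mention_tokens or token.head.i == token.i:
                continue
            if token.head.i > token.i:
                right += 1   # head lies in the right context
            else:
                left += 1    # head lies in the left context
    return right / (left + right) if (left + right) else 0.0

# Toy example: a value above 0.5 suggests the annotated mentions depend
# more on the right context than on the left.
print(rdrr(["Barack Obama visited Zagreb yesterday ."], [[(0, 2)]]))
```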

The checkpoints are too big to upload since there were a lot of experiments here. But if you still need them, I can transfer them directly. Just send me an email at david.dukic@fer.hr and we'll find a way to do that :)