shauli-ravfogel / nullspace_projection

MIT License

how were the deepmoji vectors generated #1

Closed: xyzhangfred closed this issue 4 years ago

xyzhangfred commented 4 years ago

Hi, thanks for this great work! I'm a bit confused about the DeepMoji vectors, which seem to be 2304-dimensional. Could you provide the code for generating these vectors, along with the raw data? Or could you explain how the vectors were obtained? Thanks!

yanaiela commented 4 years ago

Hey, thanks for your interest in our work.

What exactly did you find confusing? The DeepMoji vectors are of size 2304, and we simply use an MLP to reduce them to a lower dimension. The generation process is very simple: it just runs the DeepMoji encoder (using Hugging Face). I have just added the script for doing that to the notebooks dir.

Note, however, that this won't work out of the box, as we use data (and code) from a previous paper of ours.

Unfortunately, we cannot share the data since it's Twitter data, but that repo provides a script to download it. (This is also why we supply, in this work, the vectors of these texts already encoded with DeepMoji.)

I hope this helps. Let us know if you have further questions.

xyzhangfred commented 4 years ago

Thanks for the quick and detailed reply! I was just wondering how to recreate the data generating process, and your answer is very helpful!

yanaiela commented 4 years ago

By "data generating process", do you mean encoding the texts?

This step is covered by the new notebook I just uploaded and linked to in the previous comment. The rest (learning the MLP to predict 'sentiment') is already included in the codebase and is covered by this script (the deep_moji.jsonnet config file).
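
For readers following along, here is a rough sketch of the shape of that second step: a small MLP that maps a 2304-dimensional DeepMoji vector to a lower-dimensional hidden representation and then to 'sentiment' logits. This is not the repository's implementation (that is defined by the deep_moji.jsonnet config). The hidden size of 300, the two output classes, the ReLU activation, and all function names are assumptions made for illustration. The weights are random and untrained, and the code is pure Python only so that it runs without dependencies.

```python
import math
import random

INPUT_DIM = 2304   # size of the DeepMoji vectors mentioned in this thread
HIDDEN_DIM = 300   # ASSUMED lower dimension; not stated in the thread
NUM_CLASSES = 2    # ASSUMED binary 'sentiment' label


def init_layer(n_in, n_out, rng):
    """Create a dense layer with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(x, layer):
    """Apply one dense layer: weights @ x + biases."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def mlp_forward(x, hidden_layer, output_layer):
    """Return the reduced (hidden) representation and the class logits."""
    hidden = [max(0.0, h) for h in linear(x, hidden_layer)]  # ReLU (assumed)
    logits = linear(hidden, output_layer)
    return hidden, logits


rng = random.Random(0)
hidden_layer = init_layer(INPUT_DIM, HIDDEN_DIM, rng)
output_layer = init_layer(HIDDEN_DIM, NUM_CLASSES, rng)

# Stand-in for one encoded text; real inputs would be the supplied DeepMoji vectors.
vector = [rng.gauss(0.0, 1.0) for _ in range(INPUT_DIM)]

hidden, logits = mlp_forward(vector, hidden_layer, output_layer)
print(len(vector), len(hidden), len(logits))  # prints: 2304 300 2
```

In a setup like this, the hidden vector is the reduced representation, and training would fit the weights to the sentiment labels instead of leaving them random.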

xyzhangfred commented 4 years ago

Yes, I meant the encoding process. I will check out the new notebook and the code from your previous paper, thanks!