Help me understand the mention scorer...!

fairy-of-9 commented 2 years ago

I had a question while looking at your paper and code.

I had understood that the mention scorer of e2e-coref assigns a one-dimensional value (a scalar score) to each span. But here, according to the config, you give the mention score 3000 dimensions.

Can you tell me whether I've misunderstood your paper or code? Thanks!

pitrack commented 2 years ago

Thanks for the great question!

The code is a bit convoluted here. The tl;dr is that for an input embedding x (dim=3092 for large models), we're essentially doing x_1 = relu(W_1 x+b) for W_1 with size (3000, 3092) and b size (3000). Then we're returning (W_2)(x_1) where W_2 has size (1, 3000).
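To make the shapes concrete, here is a minimal NumPy sketch of the tl;dr above. It is not the repository's code: the weights are random stand-ins for learned parameters, and the names `INPUT_DIM`, `HIDDEN_DIM`, and `mention_score` are made up for illustration. It only shows that the 3000 is a hidden size and that the final output is a single score per span.

```python
import numpy as np

rng = np.random.default_rng(0)

INPUT_DIM = 3092   # span embedding size for large models (from the comment above)
HIDDEN_DIM = 3000  # the "3000" seen in the config

# Random stand-ins for the learned parameters (assumption: illustration only).
W_1 = rng.standard_normal((HIDDEN_DIM, INPUT_DIM)) * 0.01
b = np.zeros(HIDDEN_DIM)
W_2 = rng.standard_normal((1, HIDDEN_DIM)) * 0.01


def mention_score(x):
    """Score one span embedding x of shape (INPUT_DIM,)."""
    x_1 = np.maximum(W_1 @ x + b, 0.0)  # relu(W_1 x + b), shape (HIDDEN_DIM,)
    return W_2 @ x_1                    # shape (1,): one score per span


x = rng.standard_normal(INPUT_DIM)
hidden = np.maximum(W_1 @ x + b, 0.0)
score = mention_score(x)
print(hidden.shape, score.shape)  # (3000,) (1,)
```

So the 3000-dimensional vector only exists as an intermediate activation; what the scorer returns per span is one number.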

Line by line (comparing with the Joshi et al. repo which is essentially the same as e2e model):

x has dim 3092: ours vs. theirs

relu(W_1 x + b): ours (note FFNN is a wrapper which calls util.FFNN) and theirs (also a wrapper, calling an even more general util.FFNN)

You've observed that our W_1 maps 3092 dimensions to 3000, i.e. it has size (3000, 3092) in the W_1 x convention above. Theirs does too.

Finally, both models have a projection. Note that when the wrapper is called, both functions hardcode/pass in output_dim=1 or output_size=1. This triggers a conditional projection in ours, whereas it's a mandatory argument in theirs.
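The conditional projection can be pictured roughly as follows. This is a hypothetical sketch in NumPy, not the actual util.FFNN from either repo: `make_ffnn` and its parameters are invented names, and the assumption is simply that the final linear layer is added only when an output dimension is passed.

```python
import numpy as np


def make_ffnn(input_dim, hidden_dim, output_dim=None, seed=0):
    """Hypothetical builder: one hidden ReLU layer, plus a linear
    projection that is only added when output_dim is given."""
    rng = np.random.default_rng(seed)
    W_1 = rng.standard_normal((hidden_dim, input_dim)) * 0.01
    b = np.zeros(hidden_dim)
    W_2 = None
    if output_dim is not None:
        # The projection exists only because output_dim was passed in.
        W_2 = rng.standard_normal((output_dim, hidden_dim)) * 0.01

    def forward(x):
        h = np.maximum(W_1 @ x + b, 0.0)
        return h if W_2 is None else W_2 @ h

    return forward


x = np.ones(8)  # small dimensions to keep the example light
no_projection = make_ffnn(8, 6)
scorer = make_ffnn(8, 6, output_dim=1)
print(no_projection(x).shape, scorer(x).shape)  # (6,) (1,)
```

With output_dim=1 hardcoded at the call site, as described above, the scorer always ends in the projection and returns a single value.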

Anyway, I hope this clears things up and let me know if you have more questions. Thanks for making me check because when I read your question I also forgot about this and was confused/worried too 😅

fairy-of-9 commented 2 years ago

Everything has become clear!! Thank you for the good answer, paper, and code 😀

pitrack / incremental-coref

Help me understand the mention scorer...! #5