Sure, I have been wondering that too. Is there any way to absolutely forbid them from being copied into the target sequence, though? (And wouldn't copying them give rise to a decoding error, because their indices would not be in the relevant index, i.e., the target vocabulary?)
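For what it's worth, here is roughly how forbidding it could work: zero out the attention mass at feature positions before scattering it into the copy distribution, then renormalize. A minimal PyTorch sketch, not our actual implementation; the shapes and the `feature_start_idx` convention (feature symbols occupying a contiguous index range above the other symbols) are assumptions:

```python
import torch


def masked_copy_distribution(
    attention: torch.Tensor,  # (batch, src_len): attention weights over source.
    source: torch.Tensor,  # (batch, src_len): source symbol indices.
    vocab_size: int,
    feature_start_idx: int,  # Assumed: first index reserved for feature symbols.
) -> torch.Tensor:
    """Scatters attention into a copy distribution, forbidding feature symbols."""
    # Zero attention mass on feature positions so they can never be copied.
    is_feature = source >= feature_start_idx
    attention = attention.masked_fill(is_feature, 0.0)
    # Renormalize so the copy distribution still sums to 1.
    attention = attention / attention.sum(dim=1, keepdim=True).clamp_min(1e-12)
    # Scatter-add the masked attention into vocabulary space.
    copy_dist = torch.zeros(source.size(0), vocab_size, device=source.device)
    copy_dist.scatter_add_(1, source, attention)
    return copy_dist
```

Masked positions then contribute zero probability to indices outside the target vocabulary, which should also sidestep the decoding error above.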
Here's my hypothesis though (and maybe this is just my "neat"ness coming out in this admittedly "scruffy" project): concatenation is never substantially better than separate encoders, and it's occasionally worse. If I am right about this, then we could free ourselves of all that code specific to concatenating models, eventually.
This is a good point. If we implement the option to concat for every model, then we could run this experiment.
"To conatenate or not to concatenate": great SIGMORPHON short paper idea.
I was just thinking: though our pointer-generator implementation(s) take care to encode features separately, so that they are not used in the attention distribution for the pointer probabilities, it is worth making it easy to treat features as just more input symbols alongside the lemmas.

That is, to concatenate the features with the input, just as we do for the 'vanilla' seq2seq models. This is just for comparison, since these models sometimes learn such things on their own without much intervention.
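On the data side, the comparison condition could be as simple as the sketch below; the function name, field shapes, and the separator symbol are made up for illustration and are not our actual data format:

```python
def concat_features(
    lemma: list[str], features: list[str], sep: str = ";"
) -> list[str]:
    """Treats features as ordinary input symbols by appending them to the lemma.

    E.g. concat_features(list("walk"), ["V", "PST"]) ->
         ["w", "a", "l", "k", ";", "V", ";", "PST"]
    """
    sequence = list(lemma)
    for feature in features:
        sequence.append(sep)  # Assumed separator; any reserved symbol works.
        sequence.append(feature)
    return sequence
```

The encoder then sees one concatenated sequence, exactly as the vanilla seq2seq models do, with no special handling anywhere downstream.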