Feature description
Right now, `factors-dim-emb` takes a single INT. Then, `Layers::Embedding` creates a matrix where every embedding has the same dimension. Then, `embedWithConcat` (and maybe `data::factored_vocab`?) take this into account.

I feel like this is not too good when dealing with factors with very different vocab sizes, for example capitalization of a word (vocab size 3) vs word inflection (vocab size ~100 for some languages). This forces either a too-small embedding for the second factor, or a too-large embedding for the first, which seems wasteful.
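To make the waste concrete, here is a minimal sketch of the current behavior in Python/NumPy (this is an illustration, not Marian's actual C++ code; the factor names, vocab sizes, and the `embed_with_concat` helper are made up to mimic the concatenation scheme described above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical factor vocabularies: capitalization has 3 entries,
# inflection has ~100, yet both must share one embedding size.
factor_vocab_sizes = {"capitalization": 3, "inflection": 100}
dim_emb = 32  # single value, as --factors-dim-emb works today

# One embedding matrix per factor, all with the same column count.
embeddings = {name: rng.normal(size=(size, dim_emb))
              for name, size in factor_vocab_sizes.items()}

def embed_with_concat(factor_ids):
    """Look up each factor's row and concatenate the results."""
    return np.concatenate([embeddings[name][idx]
                           for name, idx in factor_ids.items()])

vec = embed_with_concat({"capitalization": 1, "inflection": 42})
# Total dimension is n_factors * dim_emb, regardless of vocab size.
print(vec.shape)  # (64,)
```

The 3-entry capitalization factor gets the same 32 columns as the 100-entry inflection factor, which is the imbalance the feature request is about.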
Example
`factors-dim-emb` should behave like `dim-vocabs` when `--factors-combine=concat`, i.e. accept a list of INTs, one per factor.
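The proposed behavior can be sketched the same way (again in Python/NumPy as an illustration only; the per-factor dimensions here are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Proposed: --factors-dim-emb takes a list with one entry per factor,
# mirroring how dim-vocabs takes one vocab size per stream.
factor_vocab_sizes = [3, 100]   # capitalization, inflection
factor_dims = [4, 32]           # small vocab gets a small embedding

embeddings = [rng.normal(size=(v, d))
              for v, d in zip(factor_vocab_sizes, factor_dims)]

def embed_with_concat(ids):
    # The concatenated vector now has sum(factor_dims) entries.
    return np.concatenate([emb[i] for emb, i in zip(embeddings, ids)])

print(embed_with_concat([1, 42]).shape)  # (36,)
```

The concatenated width becomes `sum(factor_dims)` instead of `n_factors * dim_emb`, so each factor's capacity can match its vocab size.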
Comments
This seems easy enough to implement. Famous last words.

I'd appreciate it if somebody with good knowledge of the codebase would gauge the size of the footgun beforehand.