The ModelPart equality condition for ModelPart.reuse is too strict

ufal / neuralmonkey

An open-source tool for sequence learning in NLP built on TensorFlow.

BSD 3-Clause "New" or "Revised" License

410 stars 104 forks source link

The ModelPart equality condition for ModelPart.reuse is too strict #775

Closed varisd closed 5 years ago

varisd commented 5 years ago

IMHO, the restriction to the same object type at: https://github.com/ufal/neuralmonkey/blob/master/neuralmonkey/model/model_part.py#L38 is way too strict.

I can imagine a scenario, where I would like to reuse the variables, e.g. between TransformerEncoder and TransformerDecoder object (when the former is used as a classic Transformer encoding and the latter is used for defining additional LM loss on the encoder).

I understand this requires that the user understands which variables are created by each model part, however, I think that the flexibility you gain by disabling this control is worth it.

jindrahelcl commented 5 years ago

I agree. There's a pull request that refactors the model part implementation, you can put this comment there as a review.

jindrahelcl commented 5 years ago

Každopádně zrovna sdílet proměnný mezi Transformerama nebude fungovat protože dekodér má jiný proměnný (má navíc projekci pro cross attention) a ten saver vyžaduje, aby ty proměnný měly stejný jména a bylo jich stejně

varisd commented 5 years ago

Promenne TransformerEncoder, jsou podmnozinou TransformerDecoder, takze tam neni problem, pokud Decoder nadefinujes jako prvni

jindrahelcl commented 5 years ago

Jo, jasně.. Zamíchal jsem dvě věci dohromady..

jindrahelcl commented 5 years ago

Každopádně to prosím přidej do parameterized.py v tom PR na model refactor, ja to pak udělám..