srvk / eesen

The official repository of the Eesen project
http://arxiv.org/abs/1507.08240
Apache License 2.0

DeepBiLSTM #201

Open efosler opened 5 years ago

efosler commented 5 years ago

Is there a reason that DeepBiLSTM has two variants (relu and non-relu)? I was thinking about creating a UniLSTM, and realized that all that really needed to happen was flipping a few flags. It seems like it would be better to have one model file, DeepLSTM, with the activations and directionality as options to that model. The diffs between relu and non-relu seem minor.

For backwards compatibility, we could have model_factory just call the new function with the options set appropriately.
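
A minimal sketch of what that could look like, assuming a single options object and a name-to-options lookup. Every name here (`DeepLSTMOptions`, `LEGACY_MODELS`, `create_model`, and the legacy model strings) is hypothetical and not taken from the Eesen code; the factory only returns options rather than building a network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeepLSTMOptions:
    """Hypothetical options for a single, merged DeepLSTM model."""
    bidirectional: bool = True   # False would give the UniLSTM case
    use_relu: bool = False       # True would give the relu variant


# Hypothetical legacy names mapped onto the merged model's options.
LEGACY_MODELS = {
    "deepbilstm": DeepLSTMOptions(bidirectional=True, use_relu=False),
    "deepbilstm_relu": DeepLSTMOptions(bidirectional=True, use_relu=True),
    "unilstm": DeepLSTMOptions(bidirectional=False, use_relu=False),
}


def create_model(name: str) -> DeepLSTMOptions:
    """Stand-in for model_factory: resolve an old model name to options.

    A real factory would construct the network from these options; this
    sketch only returns them so the mapping can be inspected.
    """
    try:
        return LEGACY_MODELS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown model name: {name!r}") from None
```

With a mapping like this, existing configs that name the old variants would keep working, and a unidirectional model becomes one more entry in the table rather than a new model file.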

ramonsanabria commented 5 years ago

Hi Eric,

Yes, "relu and non-relu" is something that I was trying a long time ago. I don't think it was very important.

Yes, correct. The idea of model_factory was to decouple the models from the IO infrastructure. We could even take this further and have a layer_factory (?). This was another idea that I had in my head a long time ago. What do you think?

Thanks!

efosler commented 5 years ago

Should I bother trying to put the two back together? It would not be hard but if it's not on the critical path then I'm not going to bother. It shouldn't take more than 20 minutes to do.

I think I see what you mean, but just to clarify: how do you separate the models from the layers?
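
One possible reading of the models/layers split, which the thread does not confirm: layers are registered by name in a layer_factory, and a model is just an ordered list of layer names. The sketch below uses plain Python functions as stand-in layers; `LAYER_BUILDERS`, `register_layer`, `build_model`, and the layer names are all hypothetical.

```python
from typing import Callable, Dict, List

# A "layer" here is just a function from a list of floats to a list of floats.
Layer = Callable[[List[float]], List[float]]

# Hypothetical registry: layer name -> function that builds the layer.
LAYER_BUILDERS: Dict[str, Callable[[], Layer]] = {}


def register_layer(name: str):
    """Decorator that records a layer builder under the given name."""
    def decorator(builder: Callable[[], Layer]) -> Callable[[], Layer]:
        LAYER_BUILDERS[name] = builder
        return builder
    return decorator


@register_layer("relu")
def build_relu() -> Layer:
    return lambda xs: [max(0.0, x) for x in xs]


@register_layer("identity")
def build_identity() -> Layer:
    return lambda xs: list(xs)


def layer_factory(name: str) -> Layer:
    """Look up a builder by name and construct the layer."""
    return LAYER_BUILDERS[name]()


def build_model(layer_names: List[str]) -> Layer:
    """A model is an ordered composition of layers from the factory."""
    layers = [layer_factory(n) for n in layer_names]

    def model(xs: List[float]) -> List[float]:
        for layer in layers:
            xs = layer(xs)
        return xs

    return model
```

Under this reading, the model code only knows layer names and their order, while the details of each layer live behind the factory.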