Closed christophmluscher closed 1 year ago
Some tests regarding the downsampling shapes (similar to #4) would be nice :)
I have two questions here,
stride
should be added to the Config, since usually we set stride in Conv2d to apply the time subsamplingI have two questions here,
- the param
stride
should be added to the Config, since usually we set stride in Conv2d to apply the time subsampling
already added but not pushed yet :)
- after the front end convolution models, we need to apply one additional linear layer to project the dimension back to the model size. Should the linear layer be included in the front end, or should it be included in the ConformerBlock? I would prefer the former
I was thinking about putting this as an option into the frontend: with param int N
-> add linear layer with N
outputs, None -> no linear layer
@Atticus1806
tests fail due to the seq mask update missing. I am not 100% sure how to perform the update in a clean fashion. Is there a PyTorch way to do this?
tests fail due to the seq mask update missing. I am not 100% sure how to perform the update in a clean fashion. Is there a PyTorch way to do this?
I'm not exactly sure what you refer to. What update do you mean? You mean how to compute the seq mask which is supposed to be returned by the frontend? As I mentioned, you could apply maxpooling with the same striding and kernel size just as the other operations.
you could apply maxpooling with the same striding and kernel size just as the other operations.
this would also work for conv layers?
I will move the other frontend to another PR.
@Atticus1806 tested this within the conformer so should be ok now
everything else I would like to move to follow up PRs so that we can merge and start using this
@JackTemaki @albertz
I would change the code to only support tuples instead of both tuples and integer, this makes the code more readable and more consistent.
I'm not sure I agree. PyTorch itself also supports both. And the more common use case it that the user provides just a single int.
I have two questions here,
- the param
stride
should be added to the Config, since usually we set stride in Conv2d to apply the time subsamplingalready added but not pushed yet :)
Was this added in the final version of the PR, I don't seem to find it..
This is not specific for the Conformer so maybe
parts/conformer
is not the right place?