Hi stasys-hub,
Sorry for the WAY late reply.
You got everything correct: the last layer does indeed have a large input tensor, and that is how the model ends up with 10M+ parameters.
I am sure one could come up with a smarter, more parameter-efficient way to do this nowadays.
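(As a rough sanity check, assuming a single dense layer on the flattened [N, 100, 1005] output described in the question below: 100 × 1005 × 201 weights plus 201 biases comes to about 20.2M parameters, so comfortably past the 10M mark.)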
Cheers
Thank you for this awesome model!
Out of curiosity I am trying to reimplement the 1D DNA convolutional part of the neural network in PyTorch, but I found it a bit hard to understand the "log output". Therefore I wanted to ask whether you could clarify some of my assumptions.
When looking at the architecture, I would say the first part up to the dilated convolutions is clear. Assuming the PyTorch convention of (N, Cin, L), where N denotes the batch size, Cin the number of channels and L the sequence length, we would have an input of shape [N, 4, 1_005_000]. Assuming "same" padding (I could not find anything in the log or the original publication), we would get the following scheme:
[N, 4, 1_005_000] # input
[N, 300, 1_005_000] # layer 1 out
[N, 300, 251250] # pool 1 out
[N, 600, 251250] # layer 2 out
[N, 600, 50250] # pool 2
[N, 600, 50250] # layer 3 out
[N, 600, 10050] # pool 3 out
[N, 900, 10050] # layer 4 out
[N, 900, 2010] # pool 4 out
[N, 900, 2010] # layer 5 out
[N, 900, 1005] # pool 5 out
[N, 100, 1005] # layer 6 out
[N, 100, 1005] # pool 6 out
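For concreteness, the scheme above would correspond to something like the following PyTorch sketch. The kernel sizes are placeholders on my part (I could not find them spelled out); the channel numbers, "same" padding and pooling factors follow the scheme above, and pool 6 appears to be a no-op given the shapes:

```python
import torch
import torch.nn as nn

# Rough sketch of the convolutional trunk as I understand it.
# kernel_size=11 is a placeholder; channels and pooling factors follow the scheme above.
conv_trunk = nn.Sequential(
    nn.Conv1d(4,   300, kernel_size=11, padding="same"), nn.ReLU(), nn.MaxPool1d(4),  # -> [N, 300, 251_250]
    nn.Conv1d(300, 600, kernel_size=11, padding="same"), nn.ReLU(), nn.MaxPool1d(5),  # -> [N, 600, 50_250]
    nn.Conv1d(600, 600, kernel_size=11, padding="same"), nn.ReLU(), nn.MaxPool1d(5),  # -> [N, 600, 10_050]
    nn.Conv1d(600, 900, kernel_size=11, padding="same"), nn.ReLU(), nn.MaxPool1d(5),  # -> [N, 900, 2_010]
    nn.Conv1d(900, 900, kernel_size=11, padding="same"), nn.ReLU(), nn.MaxPool1d(2),  # -> [N, 900, 1_005]
    nn.Conv1d(900, 100, kernel_size=11, padding="same"), nn.ReLU(),                   # -> [N, 100, 1_005] (pool 6 is a no-op)
)

with torch.no_grad():
    x = torch.zeros(1, 4, 1_005_000)   # [N, Cin, L]
    print(conv_trunk(x).shape)         # torch.Size([1, 100, 1005])
```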
Now this part seems to be followed by 9 dilated 1D convolutions (10 are mentioned in the paper), so the dimensions stay the same. This is then followed by a fully connected layer mapping to the Hi-C skeleton of dim [N, 201].
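For the dilated block, I am picturing something along these lines; the dilation rates and kernel size are pure guesses on my end, only the channel count and "same" padding are taken from the shapes above:

```python
import torch.nn as nn

# 9 dilated 1D convolutions that keep the [N, 100, 1005] shape unchanged.
# Doubling dilation rates and kernel_size=3 are guesses, not from the log.
dilated_block = nn.Sequential(*[
    nn.Sequential(
        nn.Conv1d(100, 100, kernel_size=3, dilation=2 ** i, padding="same"),
        nn.ReLU(),
    )
    for i in range(1, 10)  # 9 layers, dilations 2, 4, ..., 512
])
```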
EDIT: This last part is a bit confusing to me. Given that the last output dimension is [N, 100, 1005], flattening would give [N, 100_500], a really big tensor right before the last fully connected layer (see the sketch below). Am I missing something?
I hope I understood everything else correctly, and if that's not the case I would be really glad if you could clarify where I went wrong!
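The part that surprises me, written out as a sketch (this is just my reading of the shapes above, not necessarily how the original implementation does it):

```python
import torch
import torch.nn as nn

# Flatten the [N, 100, 1005] conv output and map it onto the 201 Hi-C skeleton bins.
head = nn.Sequential(
    nn.Flatten(),                # [N, 100, 1005] -> [N, 100_500]
    nn.Linear(100 * 1005, 201),  # [N, 100_500]   -> [N, 201]
)

x = torch.zeros(2, 100, 1005)
print(head(x).shape)                              # torch.Size([2, 201])
print(sum(p.numel() for p in head.parameters()))  # 20200701 parameters in this layer alone
```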
Thank you very much in advance!