calico / basenji

Sequential regulatory activity predictions with deep convolutional neural networks.
Apache License 2.0

"When the bin size needs to be adjusted (e.g. from 128 to 100), how should I design the optimal pooling kernel size?" #148

Closed: zhanxiangzong closed this issue 1 year ago

zhanxiangzong commented 1 year ago

Hello, thank you very much for your work, which is of great help to us in exploring the function of regulatory sequences. However, I have a small question and hope to get your help.

By default, the Basenji model pools nucleotide-resolution values into 128 bp bins. How should we adjust the pooling kernel sizes in the model parameters to get the best performance when we want a different bin size?

For example, in the tutorials the three pooling kernels have sizes 8, 4, and 4, whose product (8 × 4 × 4 = 128) corresponds exactly to 128 bp bins. If I want to change the bin size to 100, how should I set the pooling kernel sizes to achieve the best performance? Should they be 4, 5, and 5, or something else? Is there any theoretical guidance for this setting?
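The arithmetic behind the question can be checked with a short sketch: the bin size is the product of the pooling kernel sizes, so any candidate set has to multiply out to the target. The helper name `total_pooling` is hypothetical and not part of Basenji; it only illustrates the constraint.

```python
from math import prod

def total_pooling(pool_sizes):
    """Overall downsampling factor of consecutive pooling layers (hypothetical helper)."""
    return prod(pool_sizes)

# Pooling sizes mentioned in the question, keyed by a descriptive label.
candidates = {
    "tutorial": [8, 4, 4],
    "proposed": [4, 5, 5],
}

for name, pools in candidates.items():
    print(name, pools, "->", total_pooling(pools), "bp bins")
```

This only confirms that a candidate reaches the desired bin size; it says nothing about which candidate performs best.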

davek44 commented 1 year ago

I would just try to keep the pooling sizes as small as possible. 100 is tough because it factors as 2 × 2 × 5 × 5, so you're stuck with two layers in which you'll have to apply width-5 pooling. I think I'd apply those up front, so 5, 5, 2, 2, but I'm not really sure.
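One way to read "as small as possible" is to give each pooling layer a single prime factor of the bin size. The sketch below works under that assumption and orders the factors largest first, following the tentative "up front" suggestion above. The function name `smallest_pool_sizes` is hypothetical, and the ordering is a guess rather than a tested recommendation.

```python
def smallest_pool_sizes(bin_size):
    """Factor bin_size into primes and return them largest first (hypothetical helper)."""
    factors = []
    n = bin_size
    p = 2
    # Trial division: strip out each prime factor as often as it divides n.
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    # Whatever remains above 1 is itself a prime factor.
    if n > 1:
        factors.append(n)
    return sorted(factors, reverse=True)

print(smallest_pool_sizes(100))  # [5, 5, 2, 2]
print(smallest_pool_sizes(128))  # [2, 2, 2, 2, 2, 2, 2]
```

Note that the result for 128 is seven width-2 layers, while the tutorial uses three layers of 8, 4, and 4, so the number of pooling layers is a separate design choice that this sketch does not settle.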

zhanxiangzong commented 1 year ago

Thank you very much for your answer. It is very helpful to me.