calico / borzoi

RNA-seq prediction with deep convolutional neural networks.
Apache License 2.0

Size of the final cropping layer #15

Open avantikalal opened 7 months ago

avantikalal commented 7 months ago

Hi,

Based on the parameters file, the number of positions cropped from each end of the Borzoi output appears to be 16 (https://github.com/calico/borzoi/blob/main/examples/params_pred.json#L69). However, according to the preprint:

In order to avoid less accurate predictions on the sequence boundaries (due to asymmetric visibility), we cropped from each side to focus the loss computation on the center 196,608 bp.

Based on my calculations, this implies that the number of positions (32 bp bins) cropped from each end should be ((524288-196608)/32)/2 = 5120, i.e. 163,840 bp per side. Could you clarify the reason for choosing 16? Thanks!
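
The arithmetic above can be reproduced with a short Python sketch. The input length (524,288 bp), the centre window (196,608 bp) and the 32 bp bin size are taken from the calculation in this post. The function name `crop_bins_per_side` is hypothetical and is not part of the Borzoi codebase. The sketch assumes the crop is counted in output bins at the final resolution.

```python
def crop_bins_per_side(seq_len: int, center_len: int, bin_size: int) -> int:
    """Return the number of output bins cropped from each end so that
    only the central `center_len` bp contribute to the loss.

    Hypothetical helper for illustration; not a Borzoi function.
    """
    cropped_bp = seq_len - center_len  # total bp removed across both ends
    if cropped_bp < 0 or cropped_bp % (2 * bin_size) != 0:
        raise ValueError("crop must be a non-negative whole number of bins per side")
    return cropped_bp // (2 * bin_size)


# Values quoted in the post above (assumed to match the released model).
SEQ_LEN = 524_288     # input sequence length in bp
CENTER_LEN = 196_608  # centre window used for the loss, in bp
BIN_SIZE = 32         # output resolution in bp per bin

expected_crop = crop_bins_per_side(SEQ_LEN, CENTER_LEN, BIN_SIZE)
print(expected_crop)             # 5120 bins per side
print(expected_crop * BIN_SIZE)  # 163840 bp per side
```

If the cropping layer in the parameters file operates at a coarser resolution than 32 bp, the expected value would change accordingly, so the comparison with 16 only holds under the assumption stated above.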