calico / basenji

Sequential regulatory activity predictions with deep convolutional neural networks.
Apache License 2.0
410 stars, 126 forks

None in maps #171

Open yardenmatok203 opened 1 year ago

yardenmatok203 commented 1 year ago

Hey,

Is it possible to have NaN values in the input or the predictions?

Also, some regions of the genome are problematic. Did you ignore them?

Thanks, Yarden

davek44 commented 1 year ago

NaNs indicate that something has gone wrong. Yes, the training data creation stage uses a blacklist of regions that frequently encounter problematic read mapping.
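To illustrate the blacklist filtering described above, here is a minimal sketch. It is not the repository's implementation: the function names, the tuple layout `(chrom, start, end)`, and the example intervals are all hypothetical, and half-open BED-style coordinates are assumed.

```python
# Hypothetical sketch: drop candidate regions that overlap a blacklisted interval.
# Intervals are (chrom, start, end) tuples, half-open as in BED files (assumption).

def overlaps(a, b):
    """Return True if two half-open intervals on the same chromosome overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def filter_blacklist(regions, blacklist):
    """Keep only the regions that overlap no blacklisted interval."""
    return [r for r in regions if not any(overlaps(r, b) for b in blacklist)]

# Made-up example data, for illustration only.
regions = [("chr1", 0, 1000), ("chr1", 1000, 2000), ("chr2", 0, 1000)]
blacklist = [("chr1", 900, 1100)]
kept = filter_blacklist(regions, blacklist)
```

The quadratic scan is fine for a small example; for genome-scale inputs an interval tree or a sorted sweep would likely be a better fit.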

yardenmatok203 commented 1 year ago

Where can I find this blacklist?

Also, I see there is a variable "seq_hic_nan", so can some of the values be NaN?

Last question: my data is in "allValidPairs" format. How can I create a cool file with your bin size and format?

Thanks, Yarden

davek44 commented 1 year ago

Yes, for HiC NaNs regularly occur, and we interpolate to set the values.

I'm not familiar with allValidPairs format.
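As a rough illustration of filling NaNs by interpolation, the sketch below linearly interpolates missing values in a single row. This is an assumption-laden simplification, not the method used in this repository: the function name `interp_nan_1d` is hypothetical, and the actual pipeline works on 2D contact maps and may use a different interpolation scheme.

```python
import math

def interp_nan_1d(values):
    """Linearly interpolate NaN entries; edge NaNs copy the nearest valid value."""
    out = list(values)
    valid = [i for i, v in enumerate(out) if not math.isnan(v)]
    if not valid:
        return out  # no valid values to interpolate from
    for i, v in enumerate(out):
        if not math.isnan(v):
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            # Weight by the distance to the two nearest valid neighbours.
            w = (i - left) / (right - left)
            out[i] = values[left] * (1 - w) + values[right] * w
    return out

# Made-up example row with interior gaps.
row = [1.0, float("nan"), 3.0, float("nan"), float("nan"), 6.0]
filled = interp_nan_1d(row)
```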

yardenmatok203 commented 1 year ago

Do you know which library I can use? My HiC data has bins of 500, 1000, 1500, ...

Thank you, Yarden
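One possible approach to the binning question, sketched under stated assumptions: the code below counts contacts per bin pair from tab-separated pair lines. It assumes the HiC-Pro allValidPairs column order (read name, chrom1, pos1, strand1, chrom2, pos2, strand2, ...), which should be checked against the actual file. The function name `bin_pairs` and the example reads are hypothetical, and the output is a plain counter, not a cool file.

```python
from collections import Counter

def bin_pairs(lines, binsize):
    """Count contacts per (chrom1, bin1, chrom2, bin2) from tab-separated pair lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip malformed lines
        # Assumed column layout: name, chrom1, pos1, strand1, chrom2, pos2, ...
        chrom1, pos1 = fields[1], int(fields[2])
        chrom2, pos2 = fields[4], int(fields[5])
        a = (chrom1, pos1 // binsize)
        b = (chrom2, pos2 // binsize)
        if b < a:
            a, b = b, a  # store each contact once, in sorted order
        counts[a + b] += 1
    return counts

# Made-up example reads, for illustration only.
example = [
    "read1\tchr1\t120\t+\tchr1\t2100\t-\t300",
    "read2\tchr1\t2300\t+\tchr1\t450\t-\t310",
    "read3\tchr1\t700\t+\tchr2\t50\t-\t290",
]
counts = bin_pairs(example, binsize=1000)
```

The resulting bin-pair counts could then be handed to a tool that writes cool files, such as the cooler package, at whatever bin size the model expects.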

On Mon, Aug 14, 2023 at 12:48 AM David Kelley @.***> wrote:

Yes, for HiC NaNs regularly occur, and we interpolate to set the values.

I'm not familiar with allValidPairs format.
