Open yardenmatok203 opened 1 year ago
NaNs indicate that something has gone wrong. Yes, the training data creation stage uses a blacklist of regions that frequently encounter problematic read mapping.
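As an illustration of what blacklist filtering during training data creation can look like, here is a minimal sketch. The names `overlaps` and `drop_blacklisted`, the tuple layout, and the example coordinates are all hypothetical and are not taken from the basenji code; intervals are assumed to be half-open `(chrom, start, end)`.

```python
from typing import List, Tuple

# Hypothetical interval layout: (chromosome, start, end), half-open.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals on the same chromosome overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def drop_blacklisted(regions: List[Interval], blacklist: List[Interval]) -> List[Interval]:
    """Keep only the regions that overlap no blacklisted interval."""
    return [r for r in regions if not any(overlaps(r, b) for b in blacklist)]


# Made-up example coordinates, for illustration only.
regions = [("chr1", 0, 1000), ("chr1", 1000, 2000), ("chr2", 0, 1000)]
blacklist = [("chr1", 900, 1100)]
kept = drop_blacklisted(regions, blacklist)
```

The pairwise check is quadratic; for genome-scale inputs an interval tree or a sorted sweep would be the more usual choice.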
Where can I find this blacklist?
Also, I see there is a variable "seq_hic_nan", so is it possible that some of the values are NaN?
Last question: my data is in "allValidPairs" format. How can I create a cool file with your bin size and format?
Thanks, Yarden
Yes, for HiC NaNs regularly occur, and we interpolate to set the values.
I'm not familiar with allValidPairs format.
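The reply above does not say how the interpolation is done, so the following is only a sketch of one simple option: linear interpolation over NaN runs in a one-dimensional list, with NaNs at the edges filled from the nearest finite value. The function name `interpolate_nans` is hypothetical, and this is not presented as the project's actual method.

```python
import math
from typing import List


def interpolate_nans(values: List[float]) -> List[float]:
    """Fill NaNs by linear interpolation between the nearest finite neighbors.

    Leading and trailing NaNs are filled with the nearest finite value.
    If every value is NaN, the input is returned unchanged (as a copy).
    """
    out = list(values)
    finite = [i for i, v in enumerate(out) if not math.isnan(v)]
    if not finite:
        return out

    # Edges: copy the nearest finite value outward.
    for i in range(finite[0]):
        out[i] = out[finite[0]]
    for i in range(finite[-1] + 1, len(out)):
        out[i] = out[finite[-1]]

    # Interior gaps: interpolate between consecutive finite positions.
    for left, right in zip(finite, finite[1:]):
        gap = right - left
        for i in range(left + 1, right):
            frac = (i - left) / gap
            out[i] = out[left] + frac * (out[right] - out[left])
    return out


row = [1.0, float("nan"), 3.0, float("nan"), float("nan"), 6.0]
filled = interpolate_nans(row)
```

For a two-dimensional contact matrix, the same idea could be applied along each row or each diagonal; which axis is appropriate depends on the pipeline.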
Do you know which library I can use? My Hi-C data is binned at 500, 1000, 1500,....
Thank you, Yarden
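For readers with the same question, here is a rough sketch of binning a pairs-style file into contact counts at a fixed bin size. It assumes tab-separated lines in which columns 2, 3, 5 and 6 hold chrom1, pos1, chrom2 and pos2; that column layout is an assumption about the allValidPairs format and should be checked against your own files. The function name `bin_pairs` and the example lines are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Bin = Tuple[str, int]  # (chromosome, bin start)


def bin_pairs(lines: Iterable[str], bin_size: int) -> Counter:
    """Count contacts per pair of bins from tab-separated pair records.

    Assumed column layout (0-based): 1=chrom1, 2=pos1, 4=chrom2, 5=pos2.
    """
    counts: Counter = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # skip malformed or truncated lines
        chrom1, pos1 = fields[1], int(fields[2])
        chrom2, pos2 = fields[4], int(fields[5])
        a: Bin = (chrom1, pos1 // bin_size * bin_size)
        b: Bin = (chrom2, pos2 // bin_size * bin_size)
        # Store each contact once, with the two bins in sorted order.
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts


# Made-up records, for illustration only.
example = [
    "read1\tchr1\t120\t+\tchr1\t980\t-",
    "read2\tchr1\t700\t+\tchr1\t130\t-",
    "read3\tchr1\t1600\t+\tchr1\t1700\t-",
]
counts = bin_pairs(example, bin_size=500)
```

Chromosome names sort as plain strings here, not in genomic order. The `cooler` package's `cooler cload pairs` command is one existing route for turning pair files into a cool file directly, which may be preferable to hand-rolled binning.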
On Mon, Aug 14, 2023 at 12:48 AM David Kelley wrote:
> Yes, for HiC NaNs regularly occur, and we interpolate to set the values.
> I'm not familiar with allValidPairs format.
Hey,
Is it possible to have NaN values in the input or the prediction?
Also, some regions of the genome are problematic. Did you exclude them?
Thanks, Yarden