Closed by yardenmatok203 1 year ago
What is "Unmappable"?
Thanks, Yarden
Unmappable specifies regions that are particularly difficult to map short reads to due to repeats.
How should I use these mappability regions for prediction with your weights, for HFF for example? How did you use those regions in training?
Thank you, Yarden
On Mon, Aug 14, 2023 at 23:18, David Kelley < @.***> wrote:
Unmappable specifies regions that are particularly difficult to map short reads to due to repeats.
Hey David,
One more question: does data/hg38_gaps_binsize2048_numconseq10.bed handle this blacklist?
Thanks, Yarden
Hi Yarden, I'm not sure I understand your questions. I don't recognize the filename you sent. You can choose to handle regions of difficult mappability however you'd like; there's no right or wrong way. I typically have the program clip the values in unmappable regions so that they aren't allowed to be very large.
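To illustrate the clipping idea, here is a minimal sketch. It is not the actual Basenji code. The function name `clip_unmappable`, the per-bin boolean mask, and the choice of the median of mappable bins as the default cap are all assumptions made for this example; any cap you consider reasonable would work the same way.

```python
from statistics import median


def clip_unmappable(coverage, unmappable, cap=None):
    """Cap coverage values in unmappable bins (hypothetical helper).

    coverage   -- per-bin coverage values
    unmappable -- per-bin booleans, True where the bin overlaps an
                  unmappable region
    cap        -- maximum value allowed in unmappable bins; if None, the
                  median of the mappable bins is used (an assumption made
                  for this sketch, not a documented default)
    """
    if cap is None:
        mappable_values = [v for v, u in zip(coverage, unmappable) if not u]
        if not mappable_values:
            # Nothing to derive a cap from, so return the values unchanged.
            return list(coverage)
        cap = median(mappable_values)

    # Mappable bins pass through; unmappable bins are limited to the cap.
    return [min(v, cap) if u else v for v, u in zip(coverage, unmappable)]


coverage = [1.0, 2.0, 3.0, 100.0, 0.5]
unmappable = [False, False, False, True, True]
print(clip_unmappable(coverage, unmappable))
```

Only values above the cap are changed, so low coverage in an unmappable bin is left alone and mappable bins are never modified.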
Hey,
Can you share your blacklist, which indicates which areas of the genome are problematic for prediction?
Thanks, Yarden.