calico / basenji

Sequential regulatory activity predictions with deep convolutional neural networks.
Apache License 2.0

Handling blacklist/unmappable regions #176

Open yardenmatok203 opened 1 year ago

yardenmatok203 commented 1 year ago

Hey David,

How did you handle "unmappable" regions? There are many such areas in the genome.

One more question: does data/hg38_gaps_binsize2048_numconseq10.bed handle this blacklist?

Thanks, Yarden

davek44 commented 1 year ago

Hi Yarden, I'm not sure I understand your questions. I don't recognize the filename you sent. You can choose to handle regions of difficult mappability however you'd like; there's no right or wrong way. I typically have the program clip the values in unmappable regions so that they aren't allowed to be very large.
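
To illustrate what clipping in unmappable regions can look like, here is a minimal sketch in Python with NumPy. It is not taken from the Basenji code: the function name `clip_unmappable`, the boolean mask input, and the choice of a whole-track quantile as the cap are all assumptions made for the example.

```python
import numpy as np


def clip_unmappable(coverage, unmappable_mask, clip_quantile=0.25):
    """Hypothetical helper: cap coverage values at unmappable positions.

    coverage        -- 1D array of per-position (or per-bin) signal values
    unmappable_mask -- boolean array, True where the position is unmappable
    clip_quantile   -- quantile of the whole track used as the cap
                       (an assumption; any other cap could be used instead)
    """
    coverage = np.asarray(coverage, dtype=float)
    mask = np.asarray(unmappable_mask, dtype=bool)
    clipped = coverage.copy()

    if mask.any():
        # Cap derived from the distribution of the full track.
        cap = np.quantile(coverage, clip_quantile)
        # Only unmappable positions are clipped; values already below
        # the cap are left unchanged.
        clipped[mask] = np.minimum(clipped[mask], cap)

    return clipped


# Toy track with two suspiciously large values in unmappable positions.
coverage = np.array([1.0, 2.0, 50.0, 3.0, 4.0, 80.0])
mask = np.array([False, False, True, False, False, True])
clipped = clip_unmappable(coverage, mask, clip_quantile=0.5)
print(clipped)
```

Mappable positions pass through untouched, so the clip only limits how much a spurious pile-up in a hard-to-map region can contribute to the training targets. In practice the mask would be built from a BED file of unmappable or blacklisted intervals.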