kingsfordgroup / armatus

BSD 2-Clause "Simplified" License

Consensus output format? #20

Open Ziwei-Liu opened 3 months ago

Ziwei-Liu commented 3 months ago

Hi team,

Thank you for your work. I used armatus to call TADs in my genome, and it produced a consensus TAD table at a resolution of 40 kb. I noticed that many TADs are directly adjacent, as shown below:

LG01    680000  759999
LG01    760000  879999
LG01    880000  959999
LG01    1080000 1199999
LG01    1200000 1279999
LG01    1280000 1359999
LG01    1360000 1719999
LG01    1720000 1799999
LG01    1800000 1879999
LG01    2120000 2239999
LG01    2320000 2399999
LG01    2400000 2479999
LG01    2480000 2559999
LG01    2680000 2759999
LG01    2760000 2879999
  1. How should I identify TAD boundaries from this output? Are there 15 distinct TADs, or should I merge adjacent TADs, such as LG01 2680000 2759999 and LG01 2760000 2879999, into one larger TAD, LG01 2680000 2879999?
  2. Does this mean that the region between two non-adjacent TADs, such as the region between LG01 880000 959999 and LG01 1080000 1199999, is a TAD boundary? Or should every region between called TADs be treated as a TAD boundary, no matter how large it is? In other words, do LG01 2320000 2399999 and LG01 2400000 2479999 have a TAD boundary of 0 bp between them, or should the two adjacent regions be treated as one large TAD with no internal boundary?

Thank you!
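
To make the two cases in the questions concrete, here is a minimal Python sketch that reads the table above, reports the gap in base pairs between each pair of consecutive domains, and shows what the "merge adjacent TADs" reading from question 1 would produce. It assumes the end coordinates are inclusive, as they appear in the table. The helper names (`parse_domains`, `gaps_between`, `merge_contiguous`) are hypothetical and not part of armatus. It is only bookkeeping on the output file and does not say which interpretation armatus intends.

```python
from typing import List, Tuple

Domain = Tuple[str, int, int]  # (chromosome, start, inclusive end)

# The consensus table from this issue, copied verbatim.
TABLE = """\
LG01    680000  759999
LG01    760000  879999
LG01    880000  959999
LG01    1080000 1199999
LG01    1200000 1279999
LG01    1280000 1359999
LG01    1360000 1719999
LG01    1720000 1799999
LG01    1800000 1879999
LG01    2120000 2239999
LG01    2320000 2399999
LG01    2400000 2479999
LG01    2480000 2559999
LG01    2680000 2759999
LG01    2760000 2879999
"""


def parse_domains(text: str) -> List[Domain]:
    """Parse whitespace-separated 'chrom start end' lines, skipping others."""
    domains = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        chrom, start, end = fields
        domains.append((chrom, int(start), int(end)))
    return domains


def gaps_between(domains: List[Domain]) -> List[Tuple[Domain, Domain, int]]:
    """Return (previous, next, gap_bp) for consecutive domains on one chromosome.

    Assumes inclusive ends, so a gap of 0 means the domains touch directly.
    """
    ordered = sorted(domains)
    result = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if prev[0] != nxt[0]:
            continue
        result.append((prev, nxt, nxt[1] - prev[2] - 1))
    return result


def merge_contiguous(domains: List[Domain]) -> List[Domain]:
    """Merge runs of directly touching domains (one possible reading only)."""
    merged: List[Domain] = []
    for chrom, start, end in sorted(domains):
        if merged and merged[-1][0] == chrom and start == merged[-1][2] + 1:
            merged[-1] = (chrom, merged[-1][1], end)
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    domains = parse_domains(TABLE)
    for prev, nxt, gap in gaps_between(domains):
        label = "touching" if gap == 0 else f"gap of {gap} bp"
        print(f"{prev[0]} {prev[2]} -> {nxt[1]}: {label}")
    for chrom, start, end in merge_contiguous(domains):
        print("merged:", chrom, start, end)
```

On the table above, this reports ten touching pairs and four non-zero gaps (120 kb, 240 kb, 80 kb and 120 kb), and the merge step collapses the 15 domains into 5 contiguous blocks. Whether those blocks or the original 15 domains are the right unit is exactly the open question in this issue.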