tangerzhang / ALLHiC

ALLHiC: phasing and scaffolding polyploid genomes based on Hi-C data

superscaffolds #33

Closed diriano closed 4 years ago

diriano commented 4 years ago

Could you please explain in a bit more detail what input the script ALLHiC/scripts/link_superscaffold.pl expects? Thanks, Diego

tangerzhang commented 4 years ago

Hi Diego, please check the last 15 lines (lines 100-115) of the script (link_superscaffold.pl); the input format is listed there. The first column is a seqID from 1 to N, where N is the number of groups you specified in the ALLHiC_partition step (N=16 in our case). The second column is the target group ID generated by ALLHiC_partition (group1 to group16 in our case). The third through last columns list all potential allelic superscaffolds for the target group in column 2. The script removes Hi-C signals between the target group and its allelic superscaffolds. This information can be determined from a synteny plot generated with the jcvi package. Details of the case supplied in link_superscaffold.pl can be found in our NP paper (Supplementary Figure 28). Feel free to contact me for more details.
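
To make the column layout described above concrete, here is a minimal Python sketch that parses a table of that shape and lists the (target group, allelic superscaffold) pairs it implies. It is not part of ALLHiC and does not reproduce what link_superscaffold.pl does internally. The function names `parse_allele_table` and `pairs_to_mask`, the whitespace delimiter, and the example rows are all assumptions for illustration; the real format is the one listed at the end of the script.

```python
from typing import Dict, List, Set, Tuple


def parse_allele_table(text: str) -> Dict[str, List[str]]:
    """Parse rows of the form: seqID  target_group  [allelic_group ...].

    Hypothetical helper: assumes whitespace-separated columns and
    skips blank lines and lines starting with '#'.
    """
    table: Dict[str, List[str]] = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected at least 2 columns")
        seq_id, target, alleles = fields[0], fields[1], fields[2:]
        if not seq_id.isdigit():
            raise ValueError(f"line {line_no}: seqID must be an integer")
        table[target] = alleles
    return table


def pairs_to_mask(table: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Return every (target group, allelic superscaffold) pair in the table."""
    return {(target, allele)
            for target, alleles in table.items()
            for allele in alleles}


# Made-up rows, NOT the values from the script or the paper.
example = """\
1 group1 group5 group9
2 group2 group6
3 group3
"""

table = parse_allele_table(example)
for target, allele in sorted(pairs_to_mask(table)):
    print(target, allele)
```

A row with only two columns (such as `group3` in the made-up data) simply declares no allelic superscaffolds for that target group.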