aaranyue / quarTeT

A telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification
http://atcgn.com:8080/quarTeT/home.html

how to figure the output of centrominer #20

Closed XiaoTW123 closed 5 months ago

XiaoTW123 commented 11 months ago

Dear quartet authors, I don't understand the outputs of quartet.

  1. In the following figure, do the first and last chromosomes have a centromere? How can I judge whether it is intact or only partial?
  2. The last chromosome looks weird. Does this indicate an error in the scaffolding process?
  3. What does the "region score" mean? Is there a threshold that indicates whether the result is reliable?

Thank you.

image

Echoring commented 11 months ago

In the following fig, do the first and last chromosomes have a centromere? How to judge if it is intact or just partial?

The figure draws the most likely centromere region for each chromosome. If no evident centromere region is found, the drawn centromere may look odd. You should check the .candidate file, and load the .gff3 file together with other data in a visualization tool to check it manually. In your case, it looks like the centromere on the first chromosome was not found. The one on the last chromosome seems reasonable but requires more checking.

The last chromosome looks weird. Does this indicate an error in the scaffolding process?

No, this is a figure issue. If the predicted centromere is very near the end of the chromosome, it will look like this.

What does the "region score" mean? Is there a threshold that indicates whether the result is reliable?

"region score" = ("region tandem repeat total length" + 1/10 * "region TE total length") / "region length"

This score is used to sort the results and choose the best candidate. A typical centromere can score close to 1, but atypical centromeres may exist, so there is no valid threshold.
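For anyone who wants to check the arithmetic by hand, here is a minimal Python sketch of the formula above. It is not taken from the quarTeT source: the function name `region_score`, the dictionary keys, and the two candidate regions are made up for illustration, and all lengths are assumed to be in base pairs.

```python
def region_score(tr_length: float, te_length: float, region_length: float) -> float:
    """Return (tandem repeat length + TE length / 10) / region length."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return (tr_length + te_length / 10) / region_length


# Hypothetical candidate regions on one chromosome (illustrative numbers only).
candidates = [
    {"name": "cand_A", "tr": 900_000, "te": 50_000, "length": 1_000_000},
    {"name": "cand_B", "tr": 200_000, "te": 600_000, "length": 1_000_000},
]

# Sort from highest to lowest score, mirroring how the score is used to rank candidates.
ranked = sorted(
    candidates,
    key=lambda c: region_score(c["tr"], c["te"], c["length"]),
    reverse=True,
)

for c in ranked:
    score = region_score(c["tr"], c["te"], c["length"])
    print(c["name"], round(score, 3))
```

With these made-up numbers, cand_A (mostly tandem repeats) scores about 0.905 and ranks first, while cand_B (mostly TEs) scores about 0.26, because TE length is weighted at only one tenth.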

XiaoTW123 commented 11 months ago

@Echoring OK, thank you!