hsigeman / findZX


Usage questions: recent fusions, results, required read depth, and quality of the reference genome. Thanks #14

Closed QianghuiZhu closed 8 months ago

QianghuiZhu commented 8 months ago

Hi! FindZX is great software for identifying sex chromosomes. I have some questions about using it, and I hope you can help me with them.

Q1. What read depth of WGS data is required? Is 15× enough, or is more needed?

Q2. There is no reference (ref) genome for my study species. I see that I can use SPAdes to build one, but what quality of ref genome is required? I want to use 100-120× WGS data to assemble one, but the scaffold N50 may only be about 100K. Is that enough?

Q3. Can FindZX find recent fusion events between an autosome and the sex chromosomes (XY or ZW, all fused with the autosome)? I ask because a recent fusion may not change the genome coverage and read depth of the fused autosomal parts.

Q4. I ran the findZX-synteny function with RefB as _syntenyref and RefA as _refgenome. So, are all chromosome names, chromosome lengths, and coordinates in the figures in results/xxx/output/synteny/xxx/plots/ based on RefB?

Thanks!

hsigeman commented 8 months ago

Hi!

Thanks, that's great to hear!

Here are some answers to your questions:

Q1: Yes, 15x should be enough. The samples we used in the findZX paper had read depths between 3x and 30x, and I did not notice any large effects on the results.
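As a rough aid for Q1, expected mean read depth can be estimated before mapping as (number of reads × read length) / genome size. The sketch below is not part of findZX; the function name and the example numbers (100 million reads of 150 bp, a 1 Gb genome) are hypothetical and chosen only for illustration.

```python
def expected_mean_depth(n_reads, read_length_bp, genome_size_bp):
    """Estimate mean sequencing depth as total sequenced bases / genome size.

    This is a back-of-the-envelope estimate: it ignores unmapped reads,
    duplicates, and uneven coverage.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return n_reads * read_length_bp / genome_size_bp


# Hypothetical example: 100 million reads of 150 bp on a 1 Gb genome.
depth = expected_mean_depth(100_000_000, 150, 1_000_000_000)
print(f"Expected mean depth: {depth:.1f}x")
```

The realised depth after mapping and filtering is usually somewhat lower than this estimate, so it is best treated as an upper bound.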

Q2: A scaffold N50 of 100K seems good enough to me. In the findZX paper I used two fragmented reference genomes, from A. palliata (N50: 72 kb) and from A. arvensis (N50: 8 kb), and the pipeline was able to pick up the signal from the sex chromosomes in both species.
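For comparing an assembly against the N50 values mentioned above, here is a minimal sketch of the usual N50 definition: the length of the scaffold at which the cumulative length, counted from the longest scaffold down, first reaches half of the total assembly size. It is not part of findZX; the function name and the example scaffold lengths are hypothetical.

```python
def scaffold_n50(lengths):
    """Return the N50 (in bp) of a collection of scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical scaffold lengths (bp) summing to 1 Mb.
example_lengths = [300_000, 200_000, 150_000, 100_000, 100_000, 80_000, 70_000]
print(scaffold_n50(example_lengths))
```

In practice the lengths would be read from the assembly FASTA (for example, from a `.fai` index), which is left out here to keep the sketch self-contained.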

Q3: Yes, this is the tricky part. FindZX can only identify genome regions where recombination suppression has evolved, and where there is some level of sequence divergence between the sex chromosome copies. However, the pipeline was able to distinguish the autosomes from the sex chromosomes in the guppy (P. reticulata), which has extremely low XY differentiation, and it also picked up a ~5 Myr old fusion in A. arvensis. So it's worth a try!

Q4: Exactly, the pipeline translates all genome coordinates from RefA to RefB, and uses these for the final tables and plots.

Good luck with your analyses! Hanna

QianghuiZhu commented 8 months ago

Thanks for your quick reply, and sorry for my late response. It helps me a lot.