SunPengChuan / wgdi

WGDI: A user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes
https://wgdi.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License

Ancestor is unknown #26

Open ardy20 opened 1 year ago

ardy20 commented 1 year ago

Hi SunPeng

What is the best approach when the ancestor is unknown and there is no information about ancestor of testing plant species?

Regards

SunPengChuan commented 1 year ago

You need to provide the ancestor file yourself or download it from https://github.com/SunPengChuan/Angiosperm-karyotype-evolution/tree/master/Karyotype. You can also use wgdi '-km' to generate ancestor files for other species.

ardy20 commented 1 year ago

Hi

I am trying to create an ancestor file for Simmondsia chinensis, but I do not know how to choose parents or ancestors for it because it is the only species in the family Simmondsiaceae (order Caryophyllales). Could you please explain?

Also, when preparing total.config for the -km command, what are these two entries?

ancestor_left =
ancestor_top =

And how do I prepare them?

Should I use:

genome1_name = Amborella trichopoda
genome2_name = Acorus gramineus

SunPengChuan commented 1 year ago

Your species is a core eudicot, so you should use ACEK (ancestral core eudicot karyotype) or AEK (ancestral eudicot karyotype). ACEK is the better choice.

https://github.com/SunPengChuan/wgdi-example/tree/main/Karyotype_Evolution_Example/ancestor/ACEK
https://github.com/SunPengChuan/Angiosperm-karyotype-evolution/tree/master/Karyotype/AEK

ardy20 commented 1 year ago

OK. Then which files are these (in -km)?

ancestor_left =
ancestor_top =

SunPengChuan commented 1 year ago

Please take a closer look at the examples here: https://github.com/SunPengChuan/wgdi-example/blob/main/Karyotype_Evolution.md

As in the previous process, we use WGDI with the "-d, -icl, -bi, -c, -km, -d" parameters to compare C. japonicum and V. vinifera with ACEK. Note that only paralogous blocks should be selected for mapping, so use the '-bk' parameter to confirm that the input csv file for '-km' contains no other blocks.
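For readers stuck on the same question, here is a minimal sketch of how the two entries are typically filled: the known ancestor file goes on one axis and the other axis is set to none, so that -km can write the ancestor file for the other species. The key names are recalled from the template that `wgdi -km ?` prints and should be checked against your installed version. All file names are hypothetical, and the sketch assumes genome 1 (gff1) sits on the left axis and genome 2 (gff2) on the top.

```python
import configparser
import io


def km_conf(known_ancestor: str, axis: str = "left") -> str:
    """Build the text of a [karyotype_mapping] conf with one known ancestor file."""
    if axis not in ("left", "top"):
        raise ValueError("axis must be 'left' or 'top'")
    other = "top" if axis == "left" else "left"

    cfg = configparser.ConfigParser()
    # Key names are assumed from the `wgdi -km ?` template; file names are made up.
    cfg["karyotype_mapping"] = {
        "blast": "acek_target.blast.txt",
        "gff1": "acek.gff",
        "gff2": "target.gff",
        "score": "100",
        "evalue": "1e-5",
        "repeat_number": "5",
        # The known karyotype goes on one axis, the other axis is left as none.
        f"ancestor_{axis}": known_ancestor,
        f"ancestor_{other}": "none",
        # Lens file of the species whose ancestor file is being generated (assumed meaning).
        "the_other_lens": "target.lens",
        "blockinfo": "acek_target.block_information.csv",
        "blockinfo_reverse": "false",
        "limit_length": "8",
        # Output: the ancestor file written for the other species.
        "the_other_ancestor_file": "target.ancestor.txt",
    }

    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(km_conf("ACEK.ancestor.txt", axis="left"))
```

The printed text can be saved as the conf file passed to `wgdi -km`. If the known ancestor belongs to the species on the top axis instead, call `km_conf(..., axis="top")`.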

ardy20 commented 1 year ago

Thanks. Is there a guide to how you prepared AEK or ACEK? I found one only for AAK, here: https://github.com/SunPengChuan/Angiosperm-karyotype-evolution/blob/master/AAK_construction/commond.md

SunPengChuan commented 1 year ago

I don't think you have read this. https://github.com/SunPengChuan/wgdi-example/blob/main/Karyotype_Evolution.md

SunPengChuan commented 1 year ago

Any questions?

Botantisty commented 1 year ago

Hello,

Thank you for your work on this project and for the link to this https://github.com/SunPengChuan/wgdi-example/blob/main/Karyotype_Evolution.md, which is hard to find in the labyrinth of this site.

If I am understanding correctly, we need the cds file to run -ks so that we can run -bi in order to compare our target species with your AEK? This isn't included in the set in the folder https://github.com/SunPengChuan/Angiosperm-karyotype-evolution/tree/master/Karyotype/AEK.

Thanks for your help.

SunPengChuan commented 1 year ago

Is this link not accessible? https://github.com/SunPengChuan/wgdi-example/blob/main/Karyotype_Evolution.md

There are no CDS files in karyotype evolution because we have no way to obtain the ancestor's sequences; they are replaced with homologous genes from extant species. You don't need to run -ks. Instead, use 'ks=none' in the conf when you run -bi.
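A minimal sketch of what a -bi conf with the Ks input switched off might look like, generated from Python so the result can be checked. The key names are recalled from the template that `wgdi -bi ?` prints and should be verified against your installed version. Every file name is hypothetical, and leaving `ks_col` at its template default is an assumption (it is presumed to be unused when no Ks file is given).

```python
import configparser
import io


def bi_conf_without_ks() -> str:
    """Build the text of a [blockinfo] conf that skips the Ks file."""
    cfg = configparser.ConfigParser()
    # Key names are assumed from the `wgdi -bi ?` template; file names are made up.
    cfg["blockinfo"] = {
        "blast": "aek_target.blast.txt",
        "gff1": "aek.gff",
        "gff2": "target.gff",
        "lens1": "aek.lens",
        "lens2": "target.lens",
        "collinearity": "aek_target.collinearity.txt",
        "score": "100",
        "evalue": "1e-5",
        "repeat_number": "20",
        "position": "order",
        # No CDS exists for the ancestor, so no Ks file is supplied.
        "ks": "none",
        # Template default, assumed to be ignored when ks is none.
        "ks_col": "ks_NG86",
        "savefile": "aek_target.block_information.csv",
    }

    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(bi_conf_without_ks())
```

Writing the conf through `configparser` rather than by hand makes it easy to confirm that `ks` really is set to `none` before running `wgdi -bi`.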

Botantisty commented 1 year ago

Thank you for the guidance. It wasn't clear from the documentation that you could pass "none" to -bi for the ks file.

Do you have any suggestions for tuning parameters when trying to create your own ancestor? My species seem to have gone through a lot since AEK! I've been following the steps you took with T. sinense, but the example doesn't go into much detail about the parameters used at each step or how they might be altered.

One of the issues I am having is that when the chromosome is mapped to itself it shows rearrangements. (This would be the equivalent of your T. sinense Chr22 (or AEK 1) mapping to T. sinense Chr22, but indicating that there has been a chromosomal rearrangement). I'm wondering which parameters would be the most likely to prevent this and at which step? I've tried playing with the p-values in the -c step, but that hasn't helped and I am wondering if I need to go back and change parameters like score or mg at the -icl step.

https://github.com/SunPengChuan/wgdi-example/blob/main/Karyotype_Evolution.md is accessible.

Above I should have pasted the other link you shared to the AEK. It can be hard to find the specific information to help with a particular tool/step when you are linked to the https://github.com/SunPengChuan/wgdi-example page and have to burrow in yourself.