chaoszhang / ASTER

Accurate Species Tree EstimatoR: a family of optimization algorithms for species tree inference (including ASTRAL & CASTER)
GNU Affero General Public License v3.0

branch lengths empty #20

CEPHAS-01 closed this issue 6 months ago

CEPHAS-01 commented 8 months ago

Hi, thanks for this tool. I have been experimenting with it and have some results.

The branches of the inferred tree came out with null (zero or -nan) branch lengths when I ran the following command:

Command: caster-site_branchlength -t $threads -o $outFile -i $inFile -f list

Output: (((((((D:0.000000,B:0.000000)0.0:-nan,A:0.000000)0.0:-nan,C:0.000000)0.0:-nan,E:0.000000)0.0:-nan,F:0.000000)0.0:-nan,G:0.000000):0.000000,H:0.000000);

Is there something I am doing wrong?

Thank you for your help in advance.

TLag
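A quick way to quantify what went wrong in an output tree like the one above is to count how many branch lengths are `-nan` or exactly zero. The following is only a minimal sketch: the function name `summarize_branch_lengths` is made up for illustration, it is not part of ASTER, and it relies on a simple regular expression rather than a full Newick parser.

```python
import re

# Matches ":<value>" where value is nan/-nan (as printed by C/C++ code)
# or an ordinary decimal number. Support values such as ")0.0" are not
# preceded by ":" and are therefore ignored.
BRANCH = re.compile(r":(-?nan|[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)", re.IGNORECASE)

def summarize_branch_lengths(newick):
    """Count total, NaN, and exactly-zero branch lengths in a Newick string."""
    values = BRANCH.findall(newick)
    nan = sum(1 for v in values if "nan" in v.lower())
    zero = sum(1 for v in values if "nan" not in v.lower() and float(v) == 0.0)
    return {"total": len(values), "nan": nan, "zero": zero}

# The tree posted in this issue.
tree = (
    "(((((((D:0.000000,B:0.000000)0.0:-nan,A:0.000000)0.0:-nan,"
    "C:0.000000)0.0:-nan,E:0.000000)0.0:-nan,F:0.000000)0.0:-nan,"
    "G:0.000000):0.000000,H:0.000000);"
)
print(summarize_branch_lengths(tree))
# {'total': 14, 'nan': 5, 'zero': 9}
```

Every branch in the posted tree is either zero or `-nan`, which suggests the problem lies upstream of branch length estimation rather than in a single odd branch.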

chaoszhang commented 8 months ago

Hi

Thanks for trying CASTER. Branch length estimation is an experimental feature which may output funny values.

However, in your particular case, I see zero support for branches, which is worrisome.

Since you added "-f list", which suggests you have multiple fasta files, can you send me the list file ($inFile) and some of the fasta files? If possible, please also send me the log files for "caster-site_branchlength" and "caster-site".

Thanks again for trying CASTER.

Best, Chao

CEPHAS-01 commented 8 months ago

Hi Chao,

Thanks so much for your prompt response and readiness to help.

Unfortunately, I am unable to share the fasta files with you because they come from assemblies that are still in progress and have not yet been published. I may have to do without the branch lengths for now.

TLag

CEPHAS-01 commented 8 months ago

Hi Chao,

One more question. When using this tool on whole genomes with multiple chromosomes, I encounter the following error:

Processing chr01.fasta... File 'chr01.fasta' is ill-formated.

I first encountered the same error when using each genome (a multi-fasta file containing multiple chromosomes) as input. After reading the multi-fasta input section of your README, I decided to group the sequences by chromosome instead, with one file per chromosome and the species name as the fasta header, such that

chr01.fasta file contains:

 >speciesA
 ACTG
 >speciesB
 CTAG
 >speciesC
 GATC

etc

This structure is what I used as input and got the error I posted above.

Any suggestions on the best way to input whole-genome assemblies (each a multi-fasta file containing multiple chromosomes) into the tool?

Thank you in advance for your help.

TLag

chaoszhang commented 7 months ago

I don't see any problems in the chr01.fasta you displayed. The list format you are currently using is the recommended format. For further diagnosis of your specific case, you can send me one chromosome file via email (chaozhang@berkeley.edu).
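When a file cannot be shared, a quick local sanity check can sometimes narrow down an "ill-formated" report. The sketch below is an assumption-laden illustration, not CASTER's parser: the function name `check_fasta_text` is hypothetical, and the conditions it flags (byte order mark, carriage returns, leading whitespace, duplicate or empty headers, unequal sequence lengths) are only common causes of FASTA parsing trouble in general. Whether any of them triggers this particular error is not confirmed in this thread.

```python
def check_fasta_text(text):
    """Return a list of possible problems; an empty list means none were found."""
    problems = []
    if text.startswith("\ufeff"):
        problems.append("file starts with a byte order mark")
        text = text.lstrip("\ufeff")
    if "\r" in text:
        problems.append("file contains carriage returns (non-Unix line endings)")

    lengths = {}  # record name -> total sequence length
    name = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if raw[0] in " \t":
            problems.append(f"line {lineno}: leading whitespace")
        if line.startswith(">"):
            name = line[1:].strip()
            if not name:
                problems.append(f"line {lineno}: empty header")
            elif name in lengths:
                problems.append(f"line {lineno}: duplicate header '{name}'")
            lengths.setdefault(name, 0)
        elif name is None:
            problems.append(f"line {lineno}: sequence data before any '>' header")
        else:
            lengths[name] += len(line)

    if not lengths:
        problems.append("no records found")
    elif len(set(lengths.values())) > 1:
        # Assumes the file is meant to be an alignment with equal-length rows.
        problems.append(f"records differ in length: {lengths}")
    return problems

# Small inline example mirroring the layout shown in this issue.
sample = ">speciesA\nACTG\n>speciesB\nCTAG\n>speciesC\nGATC\n"
print(check_fasta_text(sample))
```

To check a real file, read it in with something like `open("chr01.fasta", newline="").read()` so that the original line endings are preserved, and pass the result to the function.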

chaoszhang commented 7 months ago

Hi,

Please try the new version to see if it gives more useful debug information.

Best, Chao

CEPHAS-01 commented 7 months ago

Hi Chao,

Sorry I have not sent the chromosome file yet; it has been on my to-do list. Since there is a newer version, I will try that first and then give you feedback.

Thank you so much.

TLag