bayesiancook / pbmpi

phylobayes mpi
GNU General Public License v2.0

interpreting the output of readpb_mpi #22

Closed berkalpay closed 7 months ago

berkalpay commented 3 years ago

Hello and thank you for your wonderful work.

I'd like to analyze the posterior parameters of MG MutSelDP. To perform MCMC, I run

mpirun -n 5 pb_mpi -d ../aligned_RNA_seqs_postprocessed.phylip -T ../aligned_rna_seqs_postprocessed_phylip_phyml_tree.txt -cat -gtr -mutsel chainname

When I run readpb_mpi chainname, a variety of files appear that have no headers or column labels. I am also unsure what each file refers to. Is there a guide/intuition for interpreting this output with respect to the notation in Rodrigue 2010?

Is there related information for interpreting the headers in the .trace?

Thank you for any help.

bayesiancook commented 3 years ago

Hello Berk,

The various files produced by readpb_mpi were written for specific applications at the time. If you're interested in the parameters of the substitution process, the nucleotide-level parameters are part of the .trace file. The nucleotide frequency parameters are under the columns named 'nucsA', 'nucsC', 'nucsG', and 'nucsT', and the exchangeability parameters are under the columns named 'nucrrAC', 'nucrrAG', ... , 'nucrrGT'. The parameters controlling amino acid fitness are reported in a file created by the readpb_mpi command, called .aap. This file contains the posterior mean value associated with each amino acid at each site. The first line has two numbers: the number of lines reported below it (which should be the number of codon sites in the alignment) and the number of columns (which should be 20, that is, one column per amino acid, in alphabetical order).
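As a rough illustration of the layout described above, here is a minimal Python sketch for reading both files. Several details are assumptions rather than documented behavior: that the files are whitespace-delimited, that the first line of the .trace file holds the column names, that "alphabetical order" refers to one-letter amino acid codes, and that each row of the .aap file holds only the 20 values with no leading site index. The function names `parse_aap` and `read_trace_columns` are hypothetical helpers, not part of pbmpi, and the numbers in the demo are synthetic placeholders.

```python
import io

# Assumption: "alphabetical order" means one-letter amino acid codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_aap(handle):
    """Parse an .aap-style file: header 'nsites ncols', then one row per site.

    Returns a list with one dict per site, mapping amino acid -> posterior mean.
    """
    tokens = handle.read().split()
    nsites, ncols = int(tokens[0]), int(tokens[1])
    if ncols != len(AMINO_ACIDS):
        raise ValueError(f"expected {len(AMINO_ACIDS)} columns, header says {ncols}")
    values = [float(t) for t in tokens[2:]]
    if len(values) != nsites * ncols:
        raise ValueError("number of values does not match the header")
    return [
        dict(zip(AMINO_ACIDS, values[i * ncols:(i + 1) * ncols]))
        for i in range(nsites)
    ]


def read_trace_columns(handle, prefixes=("nucs", "nucrr")):
    """Collect the trace columns whose names start with one of the prefixes.

    Assumes the first line is the header and every later line is one sample.
    """
    header = handle.readline().split()
    wanted = [i for i, name in enumerate(header) if name.startswith(prefixes)]
    columns = {header[i]: [] for i in wanted}
    for line in handle:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or incomplete lines
        for i in wanted:
            columns[header[i]].append(float(fields[i]))
    return columns


# Synthetic demo data, only to show the expected shapes.
demo_aap = io.StringIO("1 20\n" + " ".join(["0.05"] * 20) + "\n")
demo_trace = io.StringIO("#cycle\tnucsA\tnucrrAC\n1\t0.25\t1.5\n2\t0.30\t1.0\n")
site_profiles = parse_aap(demo_aap)
nuc_samples = read_trace_columns(demo_trace)
```

With real output one would pass open file handles instead, for example `open("chainname.aap")` and `open("chainname.trace")`, assuming the files carry the chain name as a prefix. The samples returned by `read_trace_columns` still include any burn-in, so early rows may need to be discarded before computing posterior summaries.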

I hope this is sufficient for your needs.

Best,

Nicolas Rodrigue

bayesiancook commented 7 months ago

closed