roblanf / BenchmarkAlignments

Benchmark empirical datasets for phylogenetic method development

add a csv file with alignment information #47

Open roblanf opened 1 year ago

roblanf commented 1 year ago

At the moment all the useful information is in nexus format, which can be annoying to work with.

E.g. we have this:

begin SETS;

    [partitions]
    CHARSET COI_1stpos = 1-1592\3;
    CHARSET COI_2ndpos = 2-1592\3;
    CHARSET COI_3rdpos = 3-1592\3;
    CHARSET 16S = 1593-3037;

    [loci]
    CHARPARTITION COI = 1:COI_1stpos, 2:COI_2ndpos, 3:COI_3rdpos;
    CHARPARTITION 16S = 1:16S;

    CHARPARTITION loci = 1:COI, 2:16S;

    [genomes]
    CHARPARTITION mitochondrial_genome = 1:COI, 2:16S;

    CHARPARTITION genomes = 1:mitochondrial_genome;

But this could be represented as a csv file with the following columns:

We could then use the csv file when entering the data, and build the nexus block directly from the csv file.
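As a rough sketch of how the nexus block might be built from a csv file: the column names used here (`name`, `start`, `end`, `step`, `locus`, `genome`) are assumptions made up for illustration, since the final column list is not settled in this issue, and `build_sets_block` is a hypothetical helper rather than anything in the repository. The closing `end;` is added because a NEXUS block needs one, even though the excerpt above stops before it.

```python
import csv
import io

# Hypothetical column names; the issue does not fix the final column list.
EXAMPLE_CSV = """name,start,end,step,locus,genome
COI_1stpos,1,1592,3,COI,mitochondrial_genome
COI_2ndpos,2,1592,3,COI,mitochondrial_genome
COI_3rdpos,3,1592,3,COI,mitochondrial_genome
16S,1593,3037,1,16S,mitochondrial_genome
"""


def numbered(names):
    """Format names as '1:a, 2:b, ...' for a CHARPARTITION statement."""
    return ", ".join(f"{i}:{n}" for i, n in enumerate(names, start=1))


def build_sets_block(csv_text):
    """Build a NEXUS SETS block from csv text with the assumed columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    loci = {}     # locus -> charset names (dicts keep insertion order)
    genomes = {}  # genome -> locus names

    lines = ["begin SETS;", "", "    [partitions]"]
    for row in rows:
        site_range = f"{row['start']}-{row['end']}"
        step = int(row["step"])
        if step > 1:
            # A step of 3 selects every third site, i.e. one codon position.
            site_range += f"\\{step}"
        lines.append(f"    CHARSET {row['name']} = {site_range};")

        loci.setdefault(row["locus"], []).append(row["name"])
        genome_loci = genomes.setdefault(row["genome"], [])
        if row["locus"] not in genome_loci:
            genome_loci.append(row["locus"])

    lines += ["", "    [loci]"]
    for locus, charsets in loci.items():
        lines.append(f"    CHARPARTITION {locus} = {numbered(charsets)};")
    lines.append(f"    CHARPARTITION loci = {numbered(loci)};")

    lines += ["", "    [genomes]"]
    for genome, genome_loci in genomes.items():
        lines.append(f"    CHARPARTITION {genome} = {numbered(genome_loci)};")
    lines.append(f"    CHARPARTITION genomes = {numbered(genomes)};")

    lines += ["", "end;"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_sets_block(EXAMPLE_CSV))
```

With one row per charset, the locus and genome groupings fall out of the csv, so the CHARPARTITION lines would not need to be typed by hand.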

roblanf commented 1 year ago

Also include a column for 'datatype' (e.g. DNA, AA). This comes from the top of the nexus alignment file.
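One possible way to pull the datatype out of an existing alignment file, assuming it appears in the usual NEXUS `format datatype=...` command near the top. The header text below is made up for illustration and is not copied from any file in the repository, and `read_datatype` is a hypothetical helper name.

```python
import re

# Illustrative header only; not taken from any alignment in the repository.
EXAMPLE_HEADER = """#NEXUS
begin DATA;
    dimensions ntax=4 nchar=3037;
    format datatype=DNA missing=? gap=-;
    matrix
"""

# Matches 'datatype=DNA', 'DATATYPE = protein', etc., ignoring case.
DATATYPE_RE = re.compile(r"\bdatatype\s*=\s*(\w+)", re.IGNORECASE)


def read_datatype(nexus_text):
    """Return the datatype token from the FORMAT command, or None if absent."""
    match = DATATYPE_RE.search(nexus_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(read_datatype(EXAMPLE_HEADER))
```

This returns whatever token the file uses, so any mapping to the labels wanted in the csv (for example a file that says `protein` where the csv should say AA) would still need to be decided.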

DS4B-ANU commented 11 months ago

Include a column for codon position too (NA if it's not a codon position), so now the columns are: