CartwrightLab / dawg

Simulating Sequence Evolution
GNU General Public License v2.0
11 stars 3 forks source link

Init commit for adding root sequence and rate from user #36

Closed zmertens closed 6 years ago

zmertens commented 6 years ago
dng-jenkins commented 6 years ago

Can one of the admins verify this patch?

reedacartwright commented 6 years ago

Look at dawg::residue_exchance for code to covert between residues and strings.

reedacartwright commented 6 years ago

Specifying rates will require some changes to the rate_model class to match supplied rates to nearest rate categories.

zmertens commented 6 years ago

I think it's working now, I tested it and the output was more or less "A" with one or two SNPs sprinkled in (using the test file you sent me).

reedacartwright commented 6 years ago

It is not working for me, have you pushed all of your changes?

reedacartwright commented 6 years ago

I still cannot get your patch to work.

This input file produces correct output:

Tree.Tree = "(A:0.1,B:0.1);"
Root.Length = 10

This input file produces no output:

Tree.Tree = "(A:0.1,B:0.1);"
Root.Seq = "AAAAAAAAAAAA"

This input file produces corrupted output:

Tree.Tree = "(A:0.1,B:0.1);"
Root.Seq = "AAAAAAAAAAAA"
Root.Length = 10

Output:

CLUSTAL multiple sequence alignment (Created by dawg 2-current-rUnknown)

A              !TAAAAAAGG
B              !TTAAAAAAA

What are those ! doing there?

I'm pretty sure codon, RNA, and AA inputs are equally broken.

reedacartwright commented 6 years ago

Also, do not overwrite your branch after you have created a pull request. It completely breaks the tracking and history of the pull request.

zmertens commented 6 years ago

Wasn't sure why the original PR wasn't viewable so I recommitted everything and then force pushed it, hopefully won't have to do that again.

zmertens commented 6 years ago

Should there be an error if the root length is not present (or a check to set the root length to the root sequence's size)?

What if the user specifies a root sequence longer then the root length?

reedacartwright commented 6 years ago

Okay, it seems like you are now making progress.

reedacartwright commented 6 years ago

I made three comments on the patch you need to address next.

reedacartwright commented 6 years ago
reedacartwright commented 6 years ago

Still need to add codon parsing and error checking to encode

zmertens commented 6 years ago

I tried changing the bit packing routine but I'm still getting invalid chars. I'm going to try and look closer how the develop handles errors and generates the codons. I feel like since we know the end-value of a codon it should be easy to match the triplet to the codon.

zmertens commented 6 years ago

Is there any reason for inserting an indel in the user defined root sequence?

zmertens commented 6 years ago

The example file that I had been using may have been bad. I tested it on develop branch and I got output that would suggest the input format was bad (wrong alphabet of letters). I'll attach the file and output for reference.

models_out_from_specify_root_sequence.txt models_out_from_develop.txt

I can confirm that the multiple_models example is consistent when specifying a root sequence and on the develop and that uses a codon model.

zmertens commented 6 years ago

Updated. Kind of forgot how long this PR has been open for, so I didn't need to reopen a new one.