Author Name: Janet Young (Janet Young)
Original Redmine Issue: 3332, https://redmine.open-bio.org/issues/3332
Original Date: 2012-03-06
Original Assignee: Bioperl Guts
Hi there,
I’ve been running pairwise codeml on a whole bunch of alignments, and am occasionally running into examples that fail to parse. Here’s one where I’ve tracked down why - a sequence name that is too long can cause trouble. I can see why, I think - the portion of the mlc file that describes the Nei-Gojobori matrix loses an important space if the sequence name is too long.
A good N-G matrix looks like this:
Nei & Gojobori 1986. dN/dS (dN, dS)
(Note: This matrix is not used in later ML. analysis.
Use runmode = -2 for ML pairwise comparison.)

seq1a
aaaaaaaaaaaaaaaaaaa -1.0000 (0.0000 0.0000)
-------------
and one that doesn’t parse looks like this (no space after the second sequence name):
-------------
Nei & Gojobori 1986. dN/dS (dN, dS)
(Note: This matrix is not used in later ML. analysis.
Use runmode = -2 for ML pairwise comparison.)

seq1a
aaaaaaaaaaaaaaaaaaaa-1.0000 (0.0000 0.0000)
-------------
I think it might only be a problem for a minority of comparisons where there is no sequence divergence, but I’m not sure. I’ll attach a test script that should demonstrate the problem clearly, but please let me know if more explanation would be helpful.
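To make the failure mode concrete, here is a minimal Python sketch of a row parser that does not rely on whitespace between the sequence name and the first value. It is only an illustration, not BioPerl's actual parsing code: the function name `parse_ng_row` and the regular expression are hypothetical, and the row layout is assumed from the two examples above (a name followed by one or more `dN/dS (dN dS)` entries). Instead of splitting on whitespace, it locates the numeric entries first and treats everything before them as the name.

```python
import re

# One "dN/dS (dN dS)" entry, e.g. "-1.0000 (0.0000 0.0000)".
# Assumed layout, taken from the example rows in this report.
TRIPLE = re.compile(
    r"(-?\d+\.\d+)\s*\(\s*(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*\)"
)


def parse_ng_row(line):
    """Split one Nei-Gojobori matrix row into (name, [(dnds, dn, ds), ...]).

    Hypothetical helper: works whether or not a space separates the
    sequence name from the first value.
    """
    matches = list(TRIPLE.finditer(line))
    if not matches:
        # A row with a name only (the first row of the matrix has no values).
        return line.strip(), []
    # Everything before the first numeric entry is taken as the name.
    name = line[:matches[0].start()].strip()
    values = [tuple(float(g) for g in m.groups()) for m in matches]
    return name, values


if __name__ == "__main__":
    good = "aaaaaaaaaaaaaaaaaaa -1.0000 (0.0000 0.0000)"
    fused = "aaaaaaaaaaaaaaaaaaaa-1.0000 (0.0000 0.0000)"
    print(parse_ng_row(good))
    print(parse_ng_row(fused))
```

One caveat with this approach: if a sequence name ends in a digit, a dot, or a hyphen and is fused to the value, the boundary between name and number is ambiguous and no pattern can recover it reliably, so keeping sequence names short is probably still the safer workaround.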
thanks very much,
Janet Young
Dr. Janet Young
Tapscott and Malik labs
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
email: jayoung …at… fhcrc.org