Closed lauradoepker closed 7 years ago
It shouldn't be doing this as far as I know. Can you point to some specific examples? If the sequence names are truncated, the corresponding links would likely be broken. Is that something you're seeing?
This might be because of Phylip trimming sequence names? There is trimming going on elsewhere (for the seeds, right?), but that ends up getting reversed. It's possible we just need to add reversal or some other more general coding scheme to get around the limitation. Thoughts @WSDeWitt?
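For reference, one possible "more general coding scheme" could look like the sketch below. It assumes the limitation is the classic PHYLIP 10-character name field, and it swaps every sequence name (not just the main seed) for a fixed-width placeholder before the PHYLIP step, then maps the placeholders back afterwards. The function names `encode_names` and `decode_names` are hypothetical and are not part of the existing code.

```python
def encode_names(names, width=10):
    """Assign each original name a unique placeholder of exactly `width` characters.

    Hypothetical helper: `width=10` assumes the classic PHYLIP name limit.
    """
    forward = {}
    for i, name in enumerate(dict.fromkeys(names)):  # drop duplicates, keep order
        short = f"s{i:0{width - 1}d}"
        if len(short) > width:
            raise ValueError("too many sequences for the chosen width")
        forward[name] = short
    reverse = {short: name for name, short in forward.items()}
    return forward, reverse


def decode_names(text, reverse):
    """Put the original names back into tool output (e.g. a Newick string)."""
    for short, name in reverse.items():
        text = text.replace(short, name)
    return text


# Usage with made-up names: both seed sequences survive the round trip.
names = ["seedA-long-name-0001", "seedA-long-name-0002"]
forward, reverse = encode_names(names)
newick = "(s000000000:0.1,s000000001:0.2);"
restored = decode_names(newick, reverse)
```

Because every placeholder has the same length and is unique, one placeholder can never be a prefix of another, so the plain string replacement in `decode_names` is unambiguous under these assumptions.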
@lauranoges I'm guessing you're back from your trip? Would you mind taking a look at this again to see if you can find any specific examples of this?
I just looked through stoat:5002 and I can no longer find any examples, so let's close this issue. Sorry to cause a fuss! @metasoarous
@lauranoges No worries! It may have been an older version of the code messing things up. Better to raise the issue and check than let the bugs sneak under the rug :-)
@lauranoges It looks like when seed sequences cluster together, the stuff Will wrote to fix up the sequence names only fixes the "main" seed sequence. So the other seed sequence name does come out the other end trimmed. This is thwarting some of my work on #99, and prompted a push on #149, but unfortunately, neither of the alternatives we looked at in #149 seems to be panning out well. Not sure yet whether I'm going to patch up over this downstream of dnaml/dnapars for #99 specifically or fix this once and for all at the source, but in any case, this issue should be reopened.
Hi @metasoarous, I think the leaf names are abbreviated on the trees, maybe so that the mf %s can be there too? Anyway, when I try to find the original sequence for a given sequence number, it turns out that it doesn't exist as-is... but there are longer sequence names that include the leaf name that do exist. Did I explain the problem well enough?
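A quick way to check for the symptom described above is to list the full sequence names that contain a given leaf name. This is only a sketch: the helper name `candidates_for_leaf` and the example names are made up for illustration, and it assumes a truncated leaf name is a substring of the original name.

```python
def candidates_for_leaf(leaf_name, full_names):
    """Return the full sequence names that contain the (possibly truncated) leaf name."""
    return [name for name in full_names if leaf_name in name]


# Hypothetical names, for illustration only.
full_names = ["QA255.067-Vh", "QA255.067-Vk", "QB850.001-Vh"]
matches = candidates_for_leaf("QA255.067", full_names)
exact = "QA255.067" in full_names
```

If `exact` is false but `matches` is non-empty, the leaf name was most likely trimmed somewhere along the way. More than one match means the original sequence cannot be recovered from the leaf name alone.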
Just curious if you're aware of this. Not a current problem for us, but should be considered for future platforms.