shiltemann closed this 6 months ago
Ok @Deeptivarshney, I tried to handle it nicely on the TAPscan side, but the best/easiest solution is still to make some changes to the FASTA headers. I now add an underscore separator between the letter code and the organelle/source suffix:
>ACTCHmt_XXYYZZ
-> >ACTCH_mt_XXYYZZ
This way everything is interpreted correctly in the TAPscan website.
Is that ok for you?
Hi @shiltemann
The standard way to represent organelle proteins is typically ACTCHmt_XXYYZZ, according to the GenomeZoo scripts/workflows. However, if the modified version works well for the TAPscan website, then it's fine for me too. Btw, why are you trying to modify those? In the GenomeZoo repo we clearly document the use of those prefix signs, and users can easily understand them by looking at them (or am I missing something?)
@Deeptivarshney it is more so that the TAPscan website can understand them better. For example, it splits the FASTA headers on the first underscore to determine which species a sequence belongs to. The same goes for the TAPs, since the FASTA headers are used in the TAPscan output.
So this is less for the humans, more for the computers ;) Would it be a lot of work on your side to change the GenomeZoo scripts? If so, I can think of a different solution, but this is the easiest from the TAPscan website's point of view.
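To illustrate the point about splitting on the first underscore, here is a minimal sketch in Python. It is not the actual TAPscan or GenomeZoo code; the function names `species_code` and `add_suffix_separator` are hypothetical, and the list of known suffixes (only `mt`, taken from the example above) is an assumption.

```python
def species_code(header: str) -> str:
    """Return the part of a FASTA header before the first underscore.

    Hypothetical helper mirroring the "split on the first underscore" rule.
    """
    return header.lstrip(">").split("_", 1)[0]


def add_suffix_separator(header: str, suffixes=("mt",)) -> str:
    """Insert an underscore between the letter code and a known suffix.

    Assumption: `suffixes` lists every organelle/source suffix in use;
    only "mt" appears in this thread, so that is the default.
    """
    prefix = ">" if header.startswith(">") else ""
    code, sep, rest = header.lstrip(">").partition("_")
    for suffix in suffixes:
        # Only split when something remains in front of the suffix.
        if code.endswith(suffix) and len(code) > len(suffix):
            code = f"{code[:-len(suffix)]}_{suffix}"
            break
    return f"{prefix}{code}{sep}{rest}"


old_header = ">ACTCHmt_XXYYZZ"
new_header = add_suffix_separator(old_header)

print(species_code(old_header))  # suffix is glued to the letter code
print(species_code(new_header))  # letter code only
```

With the old header, the first field carries the suffix along with the letter code, so the species lookup would need special handling; with the new header, the first field is the letter code on its own.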
@shiltemann, cool, no worries. I'll make the changes in the GenomeZoo repo. Just wanted to clarify that :)
awesome, you rock!