compomics / compomics-utilities

Open source Java library for computational proteomics
http://compomics.github.io/projects/compomics-utilities.html

remove extraneous space in M. tuberculosis check #22

Closed: caleb-easterly closed 7 years ago

caleb-easterly commented 7 years ago

M. tuberculosis header should start with ">M. tub", not "> M. tub." The current header would not identify M. tuberculosis FASTAs.

hbarsnes commented 7 years ago

Thanks for the suggested fix. However, this change breaks our unit tests, so we'll have to look into this and get back to you. Pretty sure you are correct though. BTW, I note that you removed the period after "tub" as well. Is this also required to parse the M. tuberculosis headers correctly?

hbarsnes commented 7 years ago

I just deployed utilities v4.11.16 which should fix the issue with the M. tuberculosis FASTA headers, unless the removal of the period after "tub" was also required? Please let me know if the new utilities version solves the problem.

PS: If you also need new versions of SearchGUI and PeptideShaker, I'll take care of that tomorrow.

caleb-easterly commented 7 years ago

I'm not sure - the header seemed to be referring to TubercuList, whose headers are of the form ">M. tuberculosis H37Rv|Rv0001|dnaA". Was that database the target? I haven't found other systematic headers for M. tuberculosis. Thank you for the quick update - I just ran into it while I was working on a tool that uses the Header class, so there is no hurry for SearchGUI and PeptideShaker.

hbarsnes commented 7 years ago

According to the documentation, our test case came from the TB Database, but it could be that the format has since been altered. I've now removed the period after "tub" in the if statement, which means that both the headers from TubercuList and our (possibly outdated) headers from the TB Database should now be supported. Deployed as utilities v4.11.17.
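
For illustration, the relaxed check described above could look roughly like the following. This is a minimal sketch and not the actual code from the Header class: the class name HeaderPrefixSketch, the method isMTuberculosisHeader and the constant M_TUB_PREFIX are hypothetical names chosen for the example. It only assumes what the thread states, namely that the prefix has no space after ">" and no period after "tub".

```java
// Hypothetical sketch; NOT the actual compomics-utilities Header class.
final class HeaderPrefixSketch {

    // Prefix as discussed in this thread: no space after '>' and
    // no period after "tub", so "M. tub." and "M. tuberculosis" both match.
    private static final String M_TUB_PREFIX = ">M. tub";

    private HeaderPrefixSketch() {
        // Utility class, not meant to be instantiated.
    }

    // Returns true if the FASTA header line starts with the M. tuberculosis prefix.
    static boolean isMTuberculosisHeader(String headerLine) {
        return headerLine != null && headerLine.startsWith(M_TUB_PREFIX);
    }
}
```

With this check, a TubercuList-style line such as ">M. tuberculosis H37Rv|Rv0001|dnaA" is accepted, as is any line that begins with ">M. tub." (period included), while a line with a space after ">" is rejected.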