smith-chem-wisc / FlashLFQ

Ultra-fast label-free quantification algorithm for mass-spectrometry proteomics
GNU Lesser General Public License v3.0

How important is the base sequence format? #76

Closed Dmorgen closed 5 years ago

Dmorgen commented 5 years ago

Hi guys,

I'm using Byonic results, and the peptide format is K.LVNEVTEFAK[+162.053]TC[+57.021]VADESAENC[+57.021]DK.S

Removing the flanking amino acids and the "." isn't very easy... is there any requirement to remove them?

Thanks! David.

trishorts commented 5 years ago

So nice to hear from you. It's been a while. Rob should be in the office in a couple of hours. I'm sure he'll let you know. I hope you and your lab are doing well.

Dmorgen commented 5 years ago

Thanks! :) It's been super hectic recently, but we're getting there... By the way, I really liked the Neofusion paper. Super impressive!

David.

rmillikin commented 5 years ago

The "Full Sequence" can be whatever you want, but the "Base Sequence" must contain only the amino acids of the actual peptide sequence, with no flanking residues, periods, or modification masses. So in this case you'd need:

Full Sequence: K.LVNEVTEFAK[+162.053]TC[+57.021]VADESAENC[+57.021]DK.S
Base Sequence: LVNEVTEFAKTCVADESAENCDK

In Excel, you can use the formula: =LEFT(RIGHT(A1,LEN(A1)-2),LEN(A1)-4)

to trim off the flanking amino acids before/after the periods (this assumes exactly one flanking character plus a period on each side). This will get you to: LVNEVTEFAK[+162.053]TC[+57.021]VADESAENC[+57.021]DK

It's harder to remove the mass differences with a formula in Excel, but you can use find+replace to replace [*] with an empty value.
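If you'd rather script it than do it in Excel, the same two steps can be sketched in Python. This is only an illustration, not something FlashLFQ provides: the function names are made up, and it assumes every peptide has at most one flanking character (a residue letter or "-") plus a period on each side, with modifications written as bracketed mass shifts as in your example.

```python
import re

# Assumed input shape: "K.PEPTIDE[+57.021]K.S" (hypothetical helper names below).
# Optional one-character flank plus period at each end; the middle is captured.
FLANKS = re.compile(r"^(?:[A-Z-]\.)?(.*?)(?:\.[A-Z-])?$")
# One bracketed modification, e.g. "[+57.021]"; cannot span two brackets.
BRACKETED_MOD = re.compile(r"\[[^\]]*\]")


def trim_flanking_residues(peptide: str) -> str:
    """Drop a leading 'K.' and trailing '.S', keeping the modifications."""
    return FLANKS.match(peptide).group(1)


def to_base_sequence(peptide: str) -> str:
    """Return only the amino acid letters of the peptide itself."""
    return BRACKETED_MOD.sub("", trim_flanking_residues(peptide))


byonic = "K.LVNEVTEFAK[+162.053]TC[+57.021]VADESAENC[+57.021]DK.S"
print(trim_flanking_residues(byonic))  # flanks removed, modifications kept
print(to_base_sequence(byonic))        # LVNEVTEFAKTCVADESAENCDK
```

The flank pattern only treats a period as a separator when it is followed by a single letter or "-" at the very end of the string, so the decimal points inside the mass shifts are left alone.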

Hope that helps! Happy to hear from you again.

Dmorgen commented 5 years ago

Thanks!