Open pixuenan opened 2 weeks ago
For a given peptide precursor (a combination of peptide sequence, charge and modification), if no spectra from the query match it, the tool prints a message like "*.mgf doesn't exist". This is not an error message from the search.
It's quite common for some peptides not to have any matched spectra in a query.
Thanks for the reply.
Is there a way to query multiple protein sequences from a single input file in the standalone version? It seems that only querying multiple peptides from a single input file is supported at the moment.
Yes, you could put your protein sequences in a FASTA-format file like the one below and then set the parameters to "-i target_proteins.fasta -t protein -s 1". This only works for novel protein search, not for known protein search.
target_proteins.fasta :
>sp|A0A087WT01|TVA27_HUMAN T cell receptor alpha variable 27 OS=Homo sapiens OX=9606 GN=TRAV27 PE=1 SV=1
MVLKFSVSLLWLQLAWVSTQLLEQSPQFLSLQEGENLTVYCNSSSVFSSLQWYRQEPGEG
PVLLVTVVTGGEVKKLKRLTFQFGDARKDSSLHLTAAQTGDTGLYLCAG
>sp|A0A1B0GTB2|TUNAR_HUMAN Protein TUNAR OS=Homo sapiens OX=9606 GN=TUNAR PE=1 SV=2
MVLTSENDEDRGGQEKESKEESVLAMLGLLGTLLNLLVLLFVYLYTTL
>sp|A0A1W2PP97|THSD8_HUMAN Thrombospondin type-1 domain-containing protein 8 OS=Homo sapiens OX=9606 GN=THSD8 PE=3 SV=2
MARTPGALLLAPLLLLQLATPALVYQDYQYLGQQGEGDSWEQLRLQHLKEVEDSLLGPWG
KWRCLCDLGKQERSREVVGTAPGPVFMDPEKLLQLRPCRQRDCPSCKPFDCDWRL
>sp|A0AUZ9|KAL1L_HUMAN KAT8 regulatory NSL complex subunit 1-like protein OS=Homo sapiens OX=9606 GN=KANSL1L PE=1 SV=2
MTPALREATAKGLSFSSLPSTMESDKMLYMESPRTVDEKLKGDTFSQMLGFPTPEPTLNT
NFVNLKHFGSPQSSKHYQTVFLMRSNSTLNKHNENYKQKKLGEPSCNKLKNLLYNGSNLQ
LSKLCLSHSEEFLKKEPLSDTTSQCMKDVQLLLDSNLTKDTNVDKVQLQNCKWYQENALL
DKVTDAELKKGLLHCTQKKLVPGHSNVPVSSSAAEKEEEVHARLLHCVSKQKLLLSQARR
TQKHLQMLLAKHVVKHYGQQMKLSMKHQLPKMKTFHEPTTLLGNSLPKCTELKPEVNTLT
AENKLWDDAKNGFARCTAAELQRFAFSATGLLSHVEEGLDSDATDSSSDDDLDEYTLRKN
VAVNCSTEWKWLVDRARVGSRWTWLQAQLSDLECKLQQLTDLHRQLRASKGLVVLEECQL
PKDLLKKQMQFADQAASLNLLGNPQVPQECQDPVPEQDFEMSPSSPTLLLRNLEKQSAQL
TELLNSLLAPLNLSPTSSPLSSKSCSHKCLANGLYRSASENLDELSSSSSWLLNQKHSKK
KRKDRTRLKSSSLTFMSTSARTRPLQSFHKRKLYRLSPTFYWTPQTLPSKETAFLNTTQM
PCLQSASTWSSYEHNSESYLLREHVSELDSSFHSVLSLPSDVPLHFHFETLLKKTELKGN
LAENKFVDEYLLSPSPVHSTLNQWRNGYSPLCKPQLRSESSAQLLQGRKKRHLSETALGE
RTKLEESDFQHTESGSHSNFTAVSNVNVLSRLQNSSRNTARRRLRSESSYDLDNLVLPMS
LVAPAKLEKLQYKELLTPSWRMVVLQPLDEYNLGKEELEDLSDEVFSLRHKKYEEREQAR
WSLWEQSKWHRRNSRAYSKNVEGQDLLLKEYPNNFSSSQQCAAASPPGLPSENQDLCAYG
LPSLNQSQETKSLWWERRAFPLKGEDMAALLCQDEKKDQVERSSTAFHGELFGTSVPENG
HHPKKQSDGMEEYKTFGLGLTNVKKNR
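Before passing a multi-protein FASTA file like the one above to the search, it can help to confirm that it parses into the expected number of records. The following is a minimal sketch in plain Python, not part of PepQuery; the function names `read_fasta` and `check_records` and the set of accepted residue letters are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Assumed set of acceptable residue letters (standard amino acids plus
# common ambiguity codes); adjust to whatever your workflow allows.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO")


def read_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs, joining wrapped lines."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_records(records: List[Tuple[str, str]]) -> List[str]:
    """Return human-readable problems; an empty list means no issues found."""
    problems = []
    for header, seq in records:
        if not seq:
            problems.append(f"{header}: empty sequence")
        bad = sorted(set(seq.upper()) - VALID_RESIDUES)
        if bad:
            problems.append(f"{header}: unexpected characters {bad}")
    return problems


# Small in-memory example using one entry from the file above.
example = [
    ">sp|A0A1B0GTB2|TUNAR_HUMAN Protein TUNAR OS=Homo sapiens OX=9606 GN=TUNAR PE=1 SV=2",
    "MVLTSENDEDRGGQEKESKEESVLAMLGLLGTLLNLLVLLFVYLYTTL",
]
records = read_fasta(example)
problems = check_records(records)
```

To check a real file, the same functions can be called with an open file handle, for example `read_fasta(open("target_proteins.fasta"))`.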
Thanks, that helps a lot. May I ask how to tell whether a protein search result is confident? In the PepQuery result, psm_rank.txt is reported at the peptide level. Is any downstream analysis required for novel protein identification?
We have a description at http://pepquery.org/document.html#saoutput of how to interpret the results in the psm_rank.txt file, such as how a match is considered confident in a query.
Hi, thanks for creating this tool. Recently, when I used PepQuery2 in both the web application and the standalone version, I got an error saying that the MGF file in the S3 bucket does not exist. I have attached a screenshot of the error from the web application.
My input peptide is:
MAEASPHPGRYFCHCCSVEIVPRLPIISVQDASLVLSRSFRKRPEHRKWFCPLHSSHRPEPATVGHVDQHLFTLPQGYGQFAFGIFDDSFEIPTFPPGAQADDGRDPESRRERDHPSRHRYGARQPRARLTTRRATGRHEGVPTLEG