MSGFPlus / msgfplus

MS-GF+ (aka MSGF+ or MSGFPlus) performs peptide identification by scoring MS/MS spectra against peptides derived from a protein sequence database.

No peptides matched to decoy portion of the database #143

Closed LoayJabre closed 1 year ago

LoayJabre commented 1 year ago

Hi - I'm running MSGF+ through OpenMS and I keep getting an error saying that no peptides were matched to the decoy portion of my database. When I run my script, a revCat.fasta file is generated, and when I inspect it manually, it does contain decoys.

My script is as follows:

# Database searching and FDR application
set -e
set -o xtrace

database_fasta_file=$1
mzml_folder=$2

# note: $mzml_folder must end with a trailing slash for the glob below to match
DIR=$mzml_folder
for FILE in "$DIR"*.mzML
do
    echo "Processing $FILE file..."
    temp_string=${FILE/.mzML/}

    # formatting the input names so that they can properly feed into the database search
    db_string=${database_fasta_file/.fasta/}
    db_string_adjusted="${db_string}.fasta"
    db_string_revcat="${db_string}.revCat.fasta"
    echo "$db_string"
    echo "$db_string_adjusted"
    echo "$db_string_revcat"

    # running the database search
    MSGFPlusAdapter -in "$FILE" \
        -executable /software/MSGFPlus-022.04.18/MSGFPlus.jar \
        -database "$db_string_adjusted" \
        -out "${temp_string}.idXML" \
        -PeptideIndexing:decoy_string 'XXX_' \
        -PeptideIndexing:decoy_string_position 'prefix' \
        -add_decoys 'true' \
        -fixed_modifications 'Carbamidomethyl (C)' \
        -threads 12 -java_memory 50000
    PeptideIndexer -in "${temp_string}.idXML" -fasta "$db_string_revcat" \
        -out "${temp_string}_PI.idXML" -decoy_string 'XXX_' -threads 6
    FalseDiscoveryRate -in "${temp_string}_PI.idXML" -out "${temp_string}_FDR.idXML" \
        -PSM 'true' -FDR:PSM 0.01 -threads 4
done

When I run the same script but point -database directly at the revCat.fasta file, the database search works fine. Could the problem be some confusion about where the script looks for the .revCat.fasta file?
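For reference, the manual inspection of the revCat.fasta file can also be done as a quick header count. This is only a minimal sketch: it assumes decoy entries are marked by the XXX_ prefix directly after the ">" of the FASTA header (the same decoy string passed to the tools above), and count_fasta_headers is a hypothetical helper name, not part of OpenMS or MS-GF+.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an OpenMS or MS-GF+ tool): count decoy and target
# headers in a FASTA file.
# Assumption: decoy headers start with ">" immediately followed by the prefix.
count_fasta_headers() {
    local fasta=$1
    local prefix=${2:-XXX_}

    # Fail early if the file is not where the script expects it to be.
    if [ ! -f "$fasta" ]; then
        echo "FASTA file not found: $fasta" >&2
        return 1
    fi

    # One pass over the file: t counts all headers, d counts decoy headers.
    awk -v p=">$prefix" '
        /^>/ { t++; if (index($0, p) == 1) d++ }
        END  { printf "decoys=%d targets=%d\n", d, t - d }
    ' "$fasta"
}
```

Calling count_fasta_headers "${database_fasta_file/.fasta/}.revCat.fasta" XXX_ checks the same path the script passes to PeptideIndexer. A "file not found" message would point to a path problem, while decoys=0 would mean the file has no entries with that prefix.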

I'm not very strong with coding, so any help would be appreciated!