Nesvilab / philosopher

PeptideProphet, PTMProphet, ProteinProphet, iProphet, Abacus, and FDR filtering
https://philosopher.nesvilab.org
GNU General Public License v3.0

PhilosopherFilter error: cannot decode packed binary. EOF #260

Closed JoleinGloerich closed 3 years ago

JoleinGloerich commented 3 years ago

Description

I'm trying to run FragPipe on timsTOF Pro data, using MSFragger for database searches and IonQuant for LFQ quantitation. I get an error at the PhilosopherFilter step.

I'm running on a Windows 10 computer with FragPipe v16.0, MSFragger v3.3, Philosopher v4.0.0, and Python v3.8.8.

log extract

PhilosopherFilter [Work dir: E:\PROCESSED_DATA\MSF_HY_MSsettings_test_complete\A_dil_ICC1201361]
C:\MyPrograms\FragPipe-jre-16.0\fragpipe\tools\philosopher\philosopher.exe filter --sequential --razor --prot 0.01 --tag rev --pepxml E:\PROCESSED_DATA\MSF_HY_MSsettings_test_complete\A_dil_ICC120_1361 --protPhilazor.bin
Process 'PhilosopherFilter' finished, exit code: 1
INFO[17:28:18] Executing Filter v4.0.0
INFO[17:28:18] Processing peptide identification files
FATA[17:28:18] Cannot decode packed binary. EOF
Process returned non-zero exit code, stopping

I'm not sure what is happening here. The full log of my analysis is attached: log_2021-08-03_17-29-25.txt

Here is the pepXML file of the run where the error occurred. H150_Y0_2xdil_A_ICC120M_S2-C1_1_1361.zip

I hope that you can help me to solve this problem. I really like your pipeline!

Thanks,

Jolein

prvst commented 3 years ago

There seems to be something wrong with your FASTA file: it looks like it contains sequences with characters that are not valid amino acid codes, including special characters. The message below is from ProteinProphet:

WARNING: Trying to compute mass of non-residue: <
WARNING: Trying to compute mass of non-residue: !
WARNING: Trying to compute mass of non-residue: !
WARNING: Trying to compute mass of non-residue:  
WARNING: Trying to compute mass of non-residue:  
WARNING: Trying to compute mass of non-residue: h
WARNING: Trying to compute mass of non-residue: h
WARNING: Trying to compute mass of non-residue: t
WARNING: Trying to compute mass of non-residue: t
WARNING: Trying to compute mass of non-residue: m
WARNING: Trying to compute mass of non-residue: m
WARNING: Trying to compute mass of non-residue: l
WARNING: Trying to compute mass of non-residue: l
WARNING: Trying to compute mass of non-residue:  
WARNING: Trying to compute mass of non-residue:  
WARNING: Trying to compute mass of non-residue:  
WARNING: Trying to compute mass of non-residue:  
WARNING: Trying to compute mass of non-residue: "
WARNING: Trying to compute mass of non-residue: "
WARNING: Trying to compute mass of non-residue: a
WARNING: Trying to compute mass of non-residue: a
WARNING: Trying to compute mass of non-residue: b
WARNING: Trying to compute mass of non-residue: b
WARNING: Trying to compute mass of non-residue: o
WARNING: Trying to compute mass of non-residue: o
WARNING: Trying to compute mass of non-residue: u
WARNING: Trying to compute mass of non-residue: u
WARNING: Trying to compute mass of non-residue: t
WARNING: Trying to compute mass of non-residue: t
WARNING: Trying to compute mass of non-residue: :
WARNING: Trying to compute mass of non-residue: :
WARNING: Trying to compute mass of non-residue: l
WARNING: Trying to compute mass of non-residue: l
WARNING: Trying to compute mass of non-residue: e
WARNING: Trying to compute mass of non-residue: e
WARNING: Trying to compute mass of non-residue: g
WARNING: Trying to compute mass of non-residue: g
WARNING: Trying to compute mass of non-residue: a
WARNING: Trying to compute mass of non-residue: a
WARNING: Trying to compute mass of non-residue: y
WARNING: Trying to compute mass of non-residue: y
WARNING: Trying to compute mass of non-residue: -
WARNING: Trying to compute mass of non-residue: -
WARNING: Trying to compute mass of non-residue: o
WARNING: Trying to compute mass of non-residue: o
WARNING: Trying to compute mass of non-residue: m
WARNING: Trying to compute mass of non-residue: m
WARNING: Trying to compute mass of non-residue: p
WARNING: Trying to compute mass of non-residue: p
WARNING: Trying to compute mass of non-residue: a
WARNING: Trying to compute mass of non-residue: a
WARNING: Trying to compute mass of non-residue: t
WARNING: Trying to compute mass of non-residue: t
WARNING: Trying to compute mass of non-residue: "
WARNING: Trying to compute mass of non-residue: "

I suggest that you get a new protein FASTA file, or clean yours, before trying again. Also, please remove space characters from file names: E\:\DATABASES\2021-08-03-decoys-reviewed-contam-UP000005640- UP000002311.fas
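One way to check a FASTA file for the kind of non-residue characters reported above is sketched below. This is a minimal illustration, not part of Philosopher or ProteinProphet: the function name `find_invalid_residues`, the accepted alphabet in `VALID_RESIDUES`, and the sample entries are all assumptions made for the example.

```python
import io

# Assumption: one-letter codes accepted as residues. The 20 standard amino
# acids plus the ambiguity/rare codes B, J, O, U, X, Z and '*' (stop).
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBJOUXZ*")


def find_invalid_residues(lines):
    """Return (line_number, header, bad_characters) for each offending sequence line."""
    problems = []
    header = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            # Header line: remember it so problems can be attributed to an entry.
            header = line[1:]
            continue
        if not line:
            continue
        bad = sorted(set(line) - VALID_RESIDUES)
        if bad:
            problems.append((number, header, "".join(bad)))
    return problems


# Hypothetical sample: a valid entry followed by HTML that was saved as FASTA.
sample = io.StringIO(
    ">sp|P00001|TEST_HUMAN example protein\n"
    "MKTAYIAKQR\n"
    '<!DOCTYPE html SYSTEM "about:legacy-compat">\n'
)
for number, header, bad in find_invalid_residues(sample):
    print(f"line {number} (after header {header!r}): unexpected characters {bad!r}")
```

To adapt it, pass an open file object instead of the `io.StringIO` sample. Lowercase letters are flagged here on purpose, since the warnings above also report them as non-residues.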

JoleinGloerich commented 3 years ago

I downloaded the database using FragPipe/Philosopher in the GUI, and I have now removed the space characters from the FASTA filename. I think the file should be okay now? 2021-08-06-decoys-reviewed-contam-UP000005640-UP000002311.zip (https://github.com/Nesvilab/philosopher/files/6944629/2021-08-06-decoys-reviewed-contam-UP000005640-UP000002311.zip)

However, I'm still getting the same error as before: log_2021-08-06_11-37-58.txt

prvst commented 3 years ago

I can't find any issues with your run, could you send me the pep.xml files from the folder A_dil_ICC120_1361?

phusen commented 3 years ago

This sounds exactly like my issue, except for me, it provides a bit more info in the end:

time="14:47:49" level=info msg="Executing Filter  v4.0.0"
time="14:47:49" level=info msg="Processing peptide identification files"
time="14:47:49" level=fatal msg="Cannot decode packed binary. strconv.ParseFloat: parsing \"0,104952\": invalid syntax"

So it looks like a comma was used as the decimal separator when the file interact-UPS1_12500amol_R1.pep.xml was written earlier, presumably because my laptop is configured with Danish regional settings. I tried again after making sure I had LC_NUMERIC=en_US.UTF-8, but the same problem appeared. I will now try with LC_ALL=C and LANG=C. You seem to be using Windows, so try adjusting your regional settings in the Control Panel, or wherever they are now. Or maybe first check whether your "interact" files contain commas as decimal separators.

I would tend to think, though, that this should work regardless of the regional settings, and most other files seem to be written with dots instead of commas.
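The check suggested above, looking for decimal commas in the "interact" files, could be sketched roughly as follows. It assumes the affected numbers are stored as XML attribute values; the helper name `find_comma_decimals` and the sample fragment are hypothetical and not taken from a real pep.xml file.

```python
import re

# Matches a quoted attribute value that looks like a number written with a
# decimal comma, e.g. value="0,104952". Assumption: the affected values are
# stored as XML attributes; element text is not inspected.
COMMA_DECIMAL = re.compile(r'(\w+)="(-?\d+,\d+(?:[eE][-+]?\d+)?)"')


def find_comma_decimals(text):
    """Return (line_number, attribute, value) for each suspicious attribute."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in COMMA_DECIMAL.finditer(line):
            hits.append((number, match.group(1), match.group(2)))
    return hits


# Hypothetical fragment with generic element and attribute names.
sample = (
    '<result score="0.9987"/>\n'
    '<parameter name="example" value="0,104952"/>\n'
)
for number, attribute, value in find_comma_decimals(sample):
    print(f"line {number}: {attribute}={value!r}")
```

For a real file, read its contents into a string and pass that to `find_comma_decimals`. An attribute that legitimately holds a two-item list such as "1,2" would also match, so each hit still needs a manual look.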

log_2021-08-16_15-12-41.txt

prvst commented 3 years ago

@phusen Your problem seems to be related to your system language. Since you are running FragPipe, I suggest you open a ticket in the FragPipe issues and ask there how to change this so that the program runs properly.

phusen commented 3 years ago

Just an update: I can confirm that it completed successfully using LC_ALL=C and LANG=C.

@prvst Thanks, I will probably do that. But it is my impression that Philosopher both wrote and tried to read the file that ended up with commas as decimal separators in one place (correct me if I'm wrong, I'm a bit new to this). I would argue that, ideally, this should not break because of the locale settings of the system, i.e., changing these settings is a bit of a workaround. But maybe you disagree? (Or I got it wrong.)
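For anyone who wants to script the LC_ALL=C / LANG=C workaround rather than change system-wide settings, a rough sketch follows. The helper `c_locale_env` is hypothetical, and the child process here is only a stand-in that echoes the variables it receives. These variables are a POSIX convention; the thread does not establish whether they have any effect on Windows.

```python
import os
import subprocess
import sys


def c_locale_env(base=None):
    """Copy an environment and force the C locale for the child process."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"
    env["LANG"] = "C"
    return env


# Stand-in child process: prints the locale variables it received.
# Replace the command with the real tool invocation when adapting this.
child = [
    sys.executable,
    "-c",
    "import os; print(os.environ['LC_ALL'], os.environ['LANG'])",
]
result = subprocess.run(
    child, env=c_locale_env(), capture_output=True, text=True, check=True
)
print(result.stdout.strip())
```

Only the child process sees the overridden variables, so the rest of the session keeps its regional settings.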

prvst commented 3 years ago

The program will write the values using your system settings (locale, language, etc.), and it will read them expecting them to be in the right format, that is why you see the issue.
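The write/read mismatch described here can be illustrated with a small sketch. The fatal message quoted earlier comes from Go's strconv.ParseFloat; Python's `float()` is used below as an analogous strict parser. The tolerant helper `parse_decimal` is hypothetical and is not how Philosopher or FragPipe handle the value; the comma-decimal string is simulated rather than produced by a real locale.

```python
def parse_decimal(text):
    """Parse a decimal string, tolerating a decimal comma.

    Assumption: the input has no thousands separators, so a single comma
    can only be a decimal mark.
    """
    try:
        return float(text)
    except ValueError:
        if text.count(",") == 1 and "." not in text:
            return float(text.replace(",", "."))
        raise


value = 0.104952
# format() with an 'f' spec always uses a dot, whatever the system locale is.
written_with_dot = format(value, ".6f")
# Simulates what a writer honouring a comma-decimal locale would produce.
written_with_comma = written_with_dot.replace(".", ",")

print(written_with_dot, written_with_comma)
try:
    float(written_with_comma)
except ValueError as err:
    print("strict parse failed:", err)
print(parse_decimal(written_with_comma))
```

Writing with a locale-independent formatter avoids the problem at the source; a tolerant reader such as `parse_decimal` only helps with files that already contain decimal commas.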

phusen commented 3 years ago

Yeah, after more digging, I see that the "offending" file was produced by code from FragPipe, so I opened a ticket there: https://github.com/Nesvilab/FragPipe/issues/432 I don't know if this is the same problem @JoleinGloerich was having, but it seems to happen at the same place. Thanks again.

prvst commented 3 years ago

Closing for lack of response.