Nesvilab / FragPipe

A cross-platform proteomics data analysis suite
http://fragpipe.nesvilab.org

MSFragger crashed because two proteins have unreadable sequences #455

Closed tbaccata closed 3 years ago

tbaccata commented 3 years ago

Dear FragPipe developers,

I am reanalyzing a published data set (http://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD016455). When I run an LFQ-MBR workflow with (almost) default settings on the downloaded raw files, MSFragger fails before finishing the first search with: java.lang.reflect.InvocationTargetException Caused by: io.grpc.StatusRuntimeException: UNKNOWN: Exception was thrown by handler.

I thought something might be off with the ThermoRawFileParser, so I converted the files to mzML format. I am running Ubuntu and got the conversion working with Mono, so that should not be the problem. Now the first search finishes, but I get: java.lang.ArrayIndexOutOfBoundsException Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: Index -20 out of bounds for length 27.

I should have the most recent versions of all dependencies installed. The logs are attached: log_messages.zip

Thank you very much!

Best, Sebastian

fcyu commented 3 years ago

Can you send us your mzML and raw files?

Thanks,

Fengchao

tbaccata commented 3 years ago

Dear Fengchao,

Thank you very much for your fast response! The raw files can be downloaded here: http://ftp.pride.ebi.ac.uk/pride/data/archive/2020/01/PXD016455/

The converted mzML files can be downloaded here: https://mega.nz/folder/nbIDAQwI#TCVJtVyO8lLf_2qN96QcAA

Just in case, the DB used for the search can be downloaded here: https://mega.nz/file/fOpziQwR#KNV7xiBWjCpHCuSIA56_uMVdr3ngYJqav0ijE5uWyK8

Best, Sebastian

fcyu commented 3 years ago

Two of the proteins (>sp|O60486|PLXC1_HUMAN and >rev_sp|O60486|PLXC1_HUMAN) in your DB are corrupted, which causes the crash:

[image]

Best,

Fengchao
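
For anyone hitting a similar crash, a quick pre-check of the FASTA file can point to suspicious entries before starting a search. The following is a minimal sketch, not MSFragger's own validation: the function names (`sequence_problem`, `find_bad_entries`, `check_fasta`) are hypothetical, and the rule that only the letters A-Z (plus a trailing `*`) count as readable is an assumption.

```python
import itertools

# Assumption: a readable sequence contains only the letters A-Z
# (case-insensitive), optionally followed by a trailing '*'.
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def sequence_problem(sequence):
    """Return a short description of what is wrong, or None if it looks fine."""
    sequence = sequence.rstrip("*")
    if not sequence:
        return "empty sequence"
    odd = sorted(set(sequence.upper()) - ALLOWED)
    if odd:
        # repr() makes control characters such as '\x00' visible.
        return "unexpected characters: " + repr("".join(odd))
    return None


def find_bad_entries(lines):
    """Return (header, problem) pairs for FASTA entries that look unreadable."""
    bad = []
    header, chunks = None, []
    # A sentinel header at the end flushes the last real entry.
    for raw in itertools.chain(lines, [">"]):
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                problem = sequence_problem("".join(chunks))
                if problem:
                    bad.append((header, problem))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    return bad


def check_fasta(path):
    """Print every suspicious entry in the FASTA file at `path`."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for header, problem in find_bad_entries(handle):
            print(header + "\t" + problem)
```

Calling `check_fasta("database.fasta")` (the file name is a placeholder) prints one line per entry that is empty or contains characters outside the assumed alphabet. As a side observation, and only a guess about the internals: `ord('-') - ord('A')` is 45 - 65 = -20, so the reported "Index -20 out of bounds for length 27" would be consistent with a `-` character in a sequence being used as a letter index, but nothing in this thread confirms that.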