smith-chem-wisc / FlashLFQ

Ultra-fast label-free quantification algorithm for mass-spectrometry proteomics
GNU Lesser General Public License v3.0

crashed - invalid parametrization for the distribution #123

Closed. rsalz closed this issue 1 year ago.

rsalz commented 1 year ago

I pulled the latest Docker image and ran FlashLFQ with the --mbr --nor --bay --sha arguments on 30 mzML files. After quantification and MBR complete, I get the following error:

Normalizing fractions
Normalizing bioreps and conditions
Normalizing techreps
Running Bayesian protein quantification analysis
FlashLFQ has crashed with the following error: One or more errors occurred. (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.) (Invalid parametrization for the distribution.).
Error report written to /mnt/data
Unhandled exception. System.IO.FileNotFoundException: Could not load file or assembly 'Easy.Common, Version=4.3.0.0, Culture=neutral, PublicKeyToken=null'. The system cannot find the file specified.

File name: 'Easy.Common, Version=4.3.0.0, Culture=neutral, PublicKeyToken=null'
   at MzLibUtil.SystemInfo.InstalledRam()
   at MzLibUtil.SystemInfo.SystemProse()
   at MzLibUtil.SystemInfo.CompleteSystemInfo()
   at Util.OutputWriter.WriteErrorReport(Exception e, String inputPath, String outputPath) in C:\projects\flashlfq\Util\OutputWriter.cs:line 38
   at CMD.FlashLfqExecutable.Run(FlashLfqSettings settings) in C:\projects\flashlfq\CMD\FlashLFQExecutable.cs:line 240
   at CMD.FlashLfqExecutable.<>c.<Main>b__1_1(FlashLfqSettings options) in C:\projects\flashlfq\CMD\FlashLFQExecutable.cs:line 23
   at CommandLine.ParserResultExtensions.WithParsed[T](ParserResult`1 result, Action`1 action)
   at CMD.FlashLfqExecutable.Main(String[] args) in C:\projects\flashlfq\CMD\FlashLFQExecutable.cs:line 22
Aborted (core dumped)

trishorts commented 1 year ago

Can you try running the quantification with just 2 files, using the same parameters? If you get the same error, that will help us troubleshoot the problem.

rsalz commented 1 year ago

It runs to completion without errors when I use only 2 of the 30 files.

trishorts commented 1 year ago

That's good and bad. It suggests a bad file, which is harder to spot in a large set, but it's good in the sense that the program itself seems to be working fine. Let me share this error with a colleague before I send you off on another errand.

rsalz commented 1 year ago

After a lot of trial and error I've identified the offending files, though I don't know what's wrong with them. I am using the '-calib.mzml' files from the Task2 folder of my MetaMorpheus run. That MetaMorpheus run completed without problems, so the errors only come up in FlashLFQ. Could you please help me find the problem? I still want to use the data, and I'm happy to email you some files to help with debugging.
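
For anyone else hitting this, the trial and error can be made more systematic with a recursive split of the file list. The following is only a rough sketch under a strong assumption: that a run fails exactly when the subset contains at least one offending file. `find_bad_files` and `run_ok` are hypothetical names; `run_ok` stands in for whatever wrapper launches FlashLFQ on a subset and reports whether it finished without crashing.

```python
from typing import Callable, List, Sequence


def find_bad_files(
    files: Sequence[str],
    run_ok: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the files that appear to trigger a failure.

    Assumption (not confirmed for FlashLFQ): a subset fails if and only if
    it contains at least one offending file. `run_ok` is a hypothetical
    callback that runs the tool on a subset and returns True on success.
    """
    files = list(files)
    # An empty or passing subset contains no offending file.
    if not files or run_ok(files):
        return []
    # A failing subset of size one is itself the offender.
    if len(files) == 1:
        return files
    # Otherwise split in half and search both halves independently,
    # so that several offending files can all be found.
    mid = len(files) // 2
    return find_bad_files(files[:mid], run_ok) + find_bad_files(files[mid:], run_ok)
```

With a handful of bad files among 30 this should need far fewer runs than testing every file separately. In practice the Bayesian step may not be meaningful on a single file, so the recursion may have to stop at a larger minimum subset size.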

rsalz commented 1 year ago

Originally, I was using a custom database in FASTA format containing proteins from GENCODE, some novel ORFs, and some fungal proteins. I took the same raw files and did a new MetaMorpheus run searching only UniProt human proteins + contaminants. When I run FlashLFQ on this output with the same parameters as before, it runs to completion with all 30 files. I think the error has something to do with my search database...

trishorts commented 1 year ago

If you can provide the big database, I can check whether it can be read successfully.
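
In the meantime, a quick sanity check of the custom FASTA might look something like the sketch below. It is not FlashLFQ's or MetaMorpheus's own reader, and nothing in this thread confirms which property of the database (if any) triggers the crash. The three checks (duplicate identifiers, empty sequences, unexpected residue characters) are just common things to rule out when merging databases from several sources. `parse_fasta`, `check_fasta`, and the `VALID_RESIDUES` alphabet are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Assumed alphabet: the 20 standard residues plus common ambiguity codes,
# U/O, and '*' for stop. Adjust to whatever the search software accepts.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZJUO*")


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs, skipping blank lines."""
    records: List[Tuple[str, str]] = []
    header = None
    seq_parts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq_parts)))
            header, seq_parts = line[1:], []
        elif header is not None:
            seq_parts.append(line)
    if header is not None:
        records.append((header, "".join(seq_parts)))
    return records


def check_fasta(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Report duplicate IDs, empty sequences, and sequences with odd characters."""
    records = parse_fasta(lines)
    # Use the first whitespace-delimited token of the header as the ID.
    ids = [h.split()[0] if h.split() else "" for h, _ in records]
    counts = Counter(ids)
    return {
        "duplicate_ids": sorted(i for i, n in counts.items() if n > 1),
        "empty_sequences": [i for i, (_, s) in zip(ids, records) if not s],
        "odd_residues": [
            i for i, (_, s) in zip(ids, records) if set(s.upper()) - VALID_RESIDUES
        ],
    }
```

To use it on a real file, something like `check_fasta(open("custom_db.fasta"))` would work, where the filename is a placeholder. An empty report would not prove the database is fine, only that these three simple problems are absent.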