statisticalbiotechnology / triqler

Source and example code for triqler (TRansparent Identification-Quantification-linked Error Rates)
Apache License 2.0

Hyperparameter fitting fails for DIANN data #24

Closed JB91451 closed 1 year ago

JB91451 commented 1 year ago

Dear Matthew,

When I run the following command on data obtained from DIANN and converted with the diann2triqler script, it results in an error:

Issued command: triqler.py --decoy_pattern Entrapment_ C:\Tmp_Data\DIANN\SpikeIn\SpikeIn_Entrapment_Results_DIANN\SpikeIn_Entrapment_DIANN_Results_TriqlerInput.tsv
Parsing triqler input file
  Reading row 0
  Reading row 1000000
Calculating identification PEPs
  Identified 1040610 PSMs at 1% FDR
Selecting best feature per run and spectrum
  featureGroupIdx: 0
Dividing intensities by 1000 for increased readability
Calculating peptide-level identification PEPs
  Identified 43396 peptides at 1% FDR
Writing peptide quant rows to file: C:\Tmp_Data\DIANN\SpikeIn\SpikeIn_Entrapment_Results_DIANN\SpikeIn_Entrapment_DIANN_Results_TriqlerInput.tsv.pqr.tsv
Calculating protein-level identification PEPs
  Identified 2089 proteins at 1% FDR
Fitting hyperparameters
Traceback (most recent call last):
  File "C:\Programs\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Programs\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\USER\AppData\Roaming\Python\Python310\Scripts\triqler.exe\__main__.py", line 7, in <module>
  File "C:\Users\USER\AppData\Roaming\Python\Python310\site-packages\triqler\triqler.py", line 39, in main
    runTriqler(params, args.in_file, args.out_file)
  File "C:\Users\USER\AppData\Roaming\Python\Python310\site-packages\triqler\triqler.py", line 136, in runTriqler
    diff_exp.doDiffExp(params, peptQuantRows, triqlerOutputFile, doPickedProteinQuantification, selectComparisonBayesTmp, qvalMethod = qvalMethod)
  File "C:\Users\USER\AppData\Roaming\Python\Python310\site-packages\triqler\diff_exp.py", line 16, in doDiffExp
    proteinOutputRows = proteinQuantificationMethod(peptQuantRows, params, proteinModifier, getEvalFeatures)
  File "C:\Users\USER\AppData\Roaming\Python\Python310\site-packages\triqler\triqler.py", line 350, in doPickedProteinQuantification
    hyperparameters.fitPriors(peptQuantRows, params)
  File "C:\Users\USER\AppData\Roaming\Python\Python310\site-packages\triqler\hyperparameters.py", line 87, in fitPriors
    fitLogitNormal(observedXICValues, params, plot) # old fitLogitNormal - missing value prior
  File "C:\Users\USER\AppData\Roaming\Python\Python310\site-packages\triqler\hyperparameters.py", line 112, in fitLogitNormal
    vals, bins = np.histogram(observedValues, bins = np.arange(minBin, maxBin, 0.1), normed = True)
  File "<__array_function__ internals>", line 198, in histogram
TypeError: histogram() got an unexpected keyword argument 'normed' 

I used an entrapment database as decoys for Triqler and let DIANN filter the output to 1% FDR (based on DIANN's decoys, not the entrapment database). Was this correct, or could the issue be related to the FDR filtering?

My NumPy version is 1.24.3.

Best regards, Juergen

JB91451 commented 1 year ago

Dear Matthew,

I figured out that this error is caused by the deprecated "normed" keyword of NumPy's histogram function, which is used in hyperparameters.py and is no longer accepted by my NumPy version (1.24.3).

Changing it in lines 112, 156 and 227 to "density" fixed the issue.

This is the help text from numpy 1.13: This keyword is deprecated in NumPy 1.6.0 due to confusing/buggy behavior. It will be removed in NumPy 2.0.0. Use the density keyword instead. If False, the result will contain the number of samples in each bin. If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1. Note that this latter behavior is known to be buggy with unequal bin widths; use density instead.
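For illustration, here is a minimal sketch of the replacement call with the density keyword. The data and the bin limits are made-up stand-ins (assumptions, not values from triqler); only the shape of the np.histogram call mirrors the line shown in the traceback.

```python
import numpy as np

# Stand-in data: NOT triqler's real values, just a reproducible sample.
rng = np.random.default_rng(0)
observed_values = rng.normal(loc=0.0, scale=1.0, size=1000)

# Hypothetical bin limits; triqler derives its own minBin/maxBin.
min_bin, max_bin = -5.0, 5.0
bin_edges = np.arange(min_bin, max_bin, 0.1)

# density=True replaces the removed normed=True keyword.
vals, edges = np.histogram(observed_values, bins=bin_edges, density=True)

# With density=True the histogram integrates to 1 over the binned range.
area = float(np.sum(vals * np.diff(edges)))
```

Here vals holds the estimated probability density per bin, so area should be 1 up to floating-point error.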

Best, Juergen

MatthewThe commented 1 year ago

Dear Juergen,

Thanks for reporting back and solving the issue. I will fix this soon, or alternatively you can create a pull request with your fixes.

-Matthew

JB91451 commented 1 year ago

Dear Matthew,

I just created a pull request. Since I am not sure how far back the density keyword is supported, the fix uses a try-except clause that tries normed first and falls back to density. If you want to make density the default in the future, you may want to update the requirements accordingly.

Best, Juergen

MatthewThe commented 1 year ago

Great, thank you! I merged your pull request.

I will leave the try-except clause in place to ensure backwards compatibility.
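A minimal sketch of what such a try-except fallback could look like is shown below. The helper name normalized_histogram is hypothetical and this is not the merged code; it only illustrates the approach described in this thread, where normed is tried first and density is used when NumPy rejects the old keyword.

```python
import numpy as np


def normalized_histogram(values, bins):
    """Hypothetical helper: normalized histogram across NumPy versions.

    Tries the legacy `normed` keyword first (as described in the thread)
    and falls back to `density` when NumPy raises a TypeError because
    the old keyword is no longer accepted.
    """
    try:
        return np.histogram(values, bins=bins, normed=True)
    except TypeError:
        # Newer NumPy releases reject `normed`; `density` is its replacement.
        return np.histogram(values, bins=bins, density=True)


# Small usage example with made-up numbers.
example_vals, example_edges = normalized_histogram(
    [0.1, 0.2, 0.3, 0.9], bins=np.array([0.0, 0.5, 1.0])
)
```

One design note: swapping the order (trying density first) would avoid the legacy keyword on every NumPy version that supports density, at the cost of changing which keyword is preferred.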