raphael-group / decifer

DeCiFer is an algorithm that simultaneously selects mutation multiplicities and clusters SNVs by their corresponding descendant cell fractions (DCF).

Error message from compute_pdfs() #22

Open lbresadola opened 2 years ago

lbresadola commented 2 years ago

Dear all,

When running DeCiFer (both releases 2.1.0 and 2.1.1), I get an error message that did not appear with release 2.0.2. This is the end of the standard error file:

Progress: |█████████████████████████████-| 99.9% Complete [[2022-May-13 18:52:25]Completed 99 for k=11 [Iterations: 10]]
Progress: |██████████████████████████████| 100.0% Complete [[2022-May-13 18:52:25]Completed 99 for k=12 [Iterations: 7]]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/my_path/.conda/envs/decifer_v2.1.1/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/my_path/.conda/envs/decifer_v2.1.1/lib/python3.9/site-packages/decifer/__main__.py", line 283, in CI
    grid = [objective(j, mut, s, bb) for j in np.linspace(0, PURITY[s], num_pts)]
  File "/my_path/.conda/envs/decifer_v2.1.1/lib/python3.9/site-packages/decifer/__main__.py", line 283, in <listcomp>
    grid = [objective(j, mut, s, bb) for j in np.linspace(0, PURITY[s], num_pts)]
  File "/my_path/.conda/envs/decifer_v2.1.1/lib/python3.9/site-packages/decifer/new_coordinate_ascent.py", line 139, in objective
    pdfs = compute_pdfs(*zip(*[form(m) for m in muti]), check=True)
TypeError: compute_pdfs() missing 3 required positional arguments: '_VS', '_AS', and '_BS'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/my_path/.conda/envs/decifer_v2.1.1/bin/decifer", line 11, in <module>
    sys.exit(main())
  File "/my_path/.conda/envs/decifer_v2.1.1/lib/python3.9/site-packages/decifer/__main__.py", line 93, in main
    run_coordinator_iterative(mutations, sample_ids, num_samples, PURITY, args,
  File "/my_path/.conda/envs/decifer_v2.1.1/lib/python3.9/site-packages/decifer/__main__.py", line 152, in run_coordinator_iterative
    CIs, PDFs = compute_CIs_mp(set(clus), bmut_HQ, num_samples, betabinomial, J, C, args['debug'], args['conservativeCIs'])
  File "/my_path/.conda/envs/decifer_v2.1.1/lib/python3.9/site-packages/decifer/__main__.py", line 256, in compute_CIs_mp
    for i in results:
  File "/my_path/.conda/envs/decifer_v2.1.1/lib/python3.9/multiprocessing/pool.py", line 870, in next
    raise value
TypeError: compute_pdfs() missing 3 required positional arguments: '_VS', '_AS', and '_BS'

This is the command I used:

decifer -p /input_path/decifer.purity.tsv /input_path/decifer.input.tsv --jobs 2 -K 12 --record --printallk --restarts 100 --output /output_path/DCF2.1.1_printtalk_K12

The output files seem to be there and they are not empty: *clusterCIs_K[4-9].tsv, *output_K[4-9].tsv, *Outliers_output_K[4-9].tsv, *_model_selection.tsv. However, I am not sure whether any additional result files are expected and were not produced because of this error.
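A check like the one described above can be scripted. The following is a minimal sketch, not part of DeCiFer: the function names and the glob patterns are assumptions based only on the file names listed in this comment, and the directory is whatever location the `--output` prefix points into.

```python
from pathlib import Path

# Assumed patterns, taken from the file names mentioned in this comment.
# Note that "*output_K*.tsv" also matches the "*Outliers_output_K*.tsv" files.
OUTPUT_PATTERNS = ["*clusterCIs_K*.tsv", "*output_K*.tsv", "*_model_selection.tsv"]


def summarize_outputs(out_dir):
    """Return {file name: size in bytes} for every matching result file."""
    found = {}
    for pattern in OUTPUT_PATTERNS:
        for path in sorted(Path(out_dir).glob(pattern)):
            found[path.name] = path.stat().st_size
    return found


def empty_outputs(out_dir):
    """Return the names of matching result files that are empty."""
    return sorted(name for name, size in summarize_outputs(out_dir).items() if size == 0)
```

Calling `summarize_outputs("/output_path")` lists which K values actually produced files, which makes it easier to see where a run stopped.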

When running DeCiFer 2.0.2, the .err file ended just like this, with no error:

Progress: |█████████████████████████████-| 99.9% Complete [[2022-Apr-13 15:20:19]Completed 99 for k=11 [Iterations: 8]]
Progress: |██████████████████████████████| 100.0% Complete [[2022-Apr-13 15:20:19]Completed 99 for k=10 [Iterations: 8]]

Could you kindly have a look at this?

Thank you very much! Best regards,

Luisa

brian-arnold commented 2 years ago

Hi Luisa, Thanks for your interest in DeCiFer and sorry you encountered this issue! It would be most efficient if I could obtain your two input files so I can look at what's happening in debug mode, if you're alright with sharing them. Sometimes the issue is also apparent just from glancing at the input files. These are the files you specified above:

/input_path/decifer.purity.tsv /input_path/decifer.input.tsv

Thanks, and let us know if you have any additional questions. Sincerely, Brian

lbresadola commented 2 years ago

Hi Brian,

sure, here are the files: decifer.input.txt decifer.purity.txt

Thanks a lot for looking into this! Best,

Luisa

brian-arnold commented 2 years ago

Hi Luisa,

Apologies for the delayed reply. I found a fix for this, will update DeCiFer ASAP, and will let you know when the new release is out. For context, this error arose when DeCiFer tried to calculate the confidence intervals for a cluster to which no mutations were assigned. This happened for one of the larger values of K (K>10). The error probably did not occur in earlier versions of DeCiFer because we recently implemented a new filtering feature that likely threw out all the mutations for some clusters because they fit the observed data poorly.

I find it hard to believe that one would have the power to detect K=11 or so distinct mutation clusters with just your two bulk samples, so when DeCiFer was fitting a model with K=11 clusters (within the range you specified), some clusters were likely assigned just a couple of mutations, all of which were filtered out in a post-processing step.
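To illustrate why an empty cluster produces exactly this `TypeError`, here is a minimal sketch. The call shape is copied from the traceback above; the bodies of `compute_pdfs` and `form` are stand-ins (assumptions), and the guard in `safe_pdfs_for_cluster` is a hypothetical workaround, not necessarily the fix that will ship.

```python
def compute_pdfs(_VS, _AS, _BS, check=False):
    # Stand-in body; only the three required positional parameters
    # match the names reported in the traceback.
    return list(zip(_VS, _AS, _BS))


def form(m):
    # Hypothetical helper: one (v, a, b) triple per mutation.
    return (m["v"], m["a"], m["b"])


def pdfs_for_cluster(muts):
    # Same call shape as the failing line in the traceback.
    # With an empty list, zip(*[]) yields nothing, so the outer * unpacks
    # zero positional arguments and compute_pdfs() raises TypeError.
    return compute_pdfs(*zip(*[form(m) for m in muts]), check=True)


def safe_pdfs_for_cluster(muts):
    # Hypothetical guard: skip clusters that lost all their mutations.
    if not muts:
        return None
    return pdfs_for_cluster(muts)
```

With at least one mutation, the transposition hands `compute_pdfs` three tuples; with none, it hands it nothing, which is why the message names all three arguments as missing.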

Whatever results DeCiFer managed to print should all be correct, and in any case this issue will disappear soon. Brian

lbresadola commented 2 years ago

Hi Brian,

thanks a lot for looking into this and for the explanation!

In this run I set -K 12 following the first point in the "Recommendations and quality control" paragraph of the README, which suggests initially setting K to a number that is ~2-3 times as large as (number of samples for the patient)+2. Maybe this rule of thumb works better when analyzing a larger number of samples? Would you then suggest using lower values of K when analyzing only two samples? Note, however, that the selected number of clusters in this run was 10, according to the model_selection.tsv file.
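For reference, the rule of thumb as quoted above works out as follows. This is only a sketch of the arithmetic; the function name is hypothetical and not part of DeCiFer.

```python
def suggested_max_k_range(num_samples):
    """Return (low, high) for the initial K under the quoted rule of thumb:
    roughly 2 to 3 times (number of samples + 2)."""
    base = num_samples + 2
    return 2 * base, 3 * base


# Two bulk samples, as in this run.
low, high = suggested_max_k_range(2)
print(low, high)  # 8 12
```

With two samples the rule gives a range of 8 to 12, so -K 12 sits at its upper end.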

Thanks again! Best,

Luisa