AlexandrovLab / SigProfilerAssignment

Assignment of known mutational signatures to individual samples and individual somatic mutations
BSD 2-Clause "Simplified" License

Decompose fit: causes an error `ValueError: cannot convert float NaN to integer` #72

Closed monagai closed 1 year ago

monagai commented 1 year ago

Decompose fit causes an error: `ValueError: cannot convert float NaN to integer`

[Environment]

I set the samples and signatures to the results from SigProfilerExtractor. Did I set the wrong values?

[/.../vcf/output/SBS/vcf.SBS96.all for sample]

MutationType    A01     A02     A03     ...
A[C>A]A         240     1       0       ...
A[C>A]C         127     1       0       ...
A[C>A]G         24      2       0       ...
...

[/.../Extract_SBS96/SBS96/All_Solutions/SBS96_10_Signatures/Signatures/SBS96_S10_Signatures.txt for signatures]

MutationType    SBS96A  SBS96B  SBS96C  SBS96D  SBS96E  SBS96F  SBS96G  SBS96H  SBS96I  SBS96J
A[C>A]A 0.0035979740925245096   0.023002376994118095    0.015547274604905396    0.03080121469683945 0.00589088752749376 0.019222156193864068    0.015870585383381694    0.003114972859309546    0.025506166475950068    0.014738297902513296
A[C>A]C 0.0019327878194973641   0.013681165443267673    0.008295955149806104    0.015346990535035729    0.0063715640525333584   0.008117534789489582    0.008902062547858804    0.0030881216739641814   0.01301863907269677     0.00995758994598873
...

[log]

sinagure_f=/.../Extract_SBS96/SBS96/All_Solutions/SBS96_10_Signatures/Signatures/SBS96_S10_Signatures.txt
sample_f=/.../vcf/output/SBS/vcf.SBS96.all
out_d=/...

 Decomposing De Novo Signatures  .....
Decompositon Plot:SBS96A |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (6957.59/s) 
Decompositon Plot:SBS96B |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (7715.18/s) 
Decompositon Plot:SBS96C |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (7969.36/s) 
Decompositon Plot:SBS96D |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (8494.20/s) 
Decompositon Plot:SBS96E |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (8066.96/s) 
Decompositon Plot:SBS96F |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (8228.27/s) 
Decompositon Plot:SBS96G |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (8246.99/s) 
Decompositon Plot:SBS96H |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (8059.38/s) 
Decompositon Plot:SBS96I |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (8101.28/s) 
Decompositon Plot:SBS96J |▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉▉| 1/1 [100%] in 0.0s (7939.15/s) 

 Assigning decomposed signature
|██████████▉⚠︎                 | (!) 3/11 [27%] in 0.1s (254.17/s) 
Traceback (most recent call last):
  File "/.../script/Decompose.py", line 22, in <module>
    Analyze.decompose_fit(sample_f, 
  File "/.../lib/python3.9/site-packages/SigProfilerAssignment/Analyzer.py", line 5, in decompose_fit
    decomp.spa_analyze(samples=samples,  output=output, signatures=signatures, signature_database=signature_database,nnls_add_penalty=nnls_add_penalty, nnls_remove_penalty=nnls_remove_penalty, initial_remove_penalty=initial_remove_penalty,genome_build=genome_build, cosmic_version=cosmic_version, make_plots=make_plots, collapse_to_SBS96=collapse_to_SBS96,connected_sigs=connected_sigs, verbose=verbose,decompose_fit_option= True,denovo_refit_option=False,cosmic_fit_option=False,devopts=devopts,new_signature_thresh_hold=new_signature_thresh_hold,exclude_signature_subgroups=exclude_signature_subgroups,exome=exome,input_type=input_type,context_type=context_type,export_probabilities=export_probabilities, export_probabilities_per_mutation=export_probabilities_per_mutation)
  File "/.../lib/python3.9/site-packages/SigProfilerAssignment/decomposition.py", line 523, in spa_analyze
    result = sub.make_final_solution(processAvg, genomes, allsigids, layer_directory2, mutation_type, index, colnames, 
  File "/.../lib/python3.9/site-packages/SigProfilerAssignment/decompose_subroutines.py", line 570, in make_final_solution
    newExposure, newSimilarity = ss.fit_signatures(fit_signatures, allgenomes[:,r])
  File "/.../lib/python3.9/site-packages/SigProfilerAssignment/single_sample.py", line 146, in fit_signatures
    newExposure[idxmaxcoef] = round(newExposure[idxmaxcoef])+maxmutation-sum(newExposure)
ValueError: cannot convert float NaN to integer

monagai commented 1 year ago

Sorry, I found that one sample had no mutations (that is, the matrix contained a column of all zeros). After I removed that sample from the data, decompose fit worked.
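
For anyone hitting the same error, the workaround described above can be sketched with pandas. This is a minimal illustration, not part of SigProfilerAssignment: the helper `drop_empty_samples` and the toy sample names `S1` to `S3` are hypothetical, and the input is assumed to be a tab-separated matrix with `MutationType` in the first column, as in the SBS96 file shown earlier.

```python
import io

import pandas as pd


def drop_empty_samples(matrix: pd.DataFrame):
    """Return (filtered matrix, names of samples whose mutation count is zero).

    Hypothetical helper for illustration; not a SigProfilerAssignment function.
    """
    totals = matrix.sum(axis=0)                  # total mutations per sample (column)
    empty = totals[totals == 0].index.tolist()   # samples with no mutations at all
    return matrix.drop(columns=empty), empty


# Toy matrix in the same layout as the SBS96 file (made-up values, S3 is empty).
toy = (
    "MutationType\tS1\tS2\tS3\n"
    "A[C>A]A\t240\t1\t0\n"
    "A[C>A]C\t127\t1\t0\n"
    "A[C>A]G\t24\t2\t0\n"
)
matrix = pd.read_csv(io.StringIO(toy), sep="\t", index_col=0)

filtered, dropped = drop_empty_samples(matrix)
print("dropped samples:", dropped)
print(filtered)
```

With a real file, the same steps would be `pd.read_csv(path, sep="\t", index_col=0)` followed by `filtered.to_csv(new_path, sep="\t")`, and the filtered file would then be passed as the samples argument to decompose fit.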