UBC-Stat-ML / fitclone


NaN's during OutterPGAS iterations #17

Closed pcream closed 4 months ago

pcream commented 5 months ago

I have had trouble getting certain data sets to run through fitclone. The .tsv/.yaml pair labeled "NMS_bottleneck_A" runs perfectly fine and outputs reasonable fitness values given the clonal composition. But if I alter the percentages of certain clones, as in "NMS_bottleneck_D" (without changing the number of clones or timepoints), the run suddenly fails with NaNs during an iteration of OutterPGAS. It's not clear what causes this. I have also tried other clonal compositions, varying in clone number and frequency, and those fail with index errors instead. I'm really unsure what could be causing these errors and why changes in frequency would break the whole pipeline.

Many thanks!

./NMS_fitclone.py:34: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
{'K': 1, 'K_prime': 4, 'MCMC_in_Gibbs_nIter': 20, 'disable_ancestor_bridge': True, 'Ne': 500, 'bridge_n_cores': 1, 'do_predict': 0, 'gp_epsilon': 0.005, 'obs_num': 4, 'gp_n_opt_restarts': 20, 'h': 0.05, 'infer_epsilon': 0.01, 'infer_epsilon_tolerance': 0, 'inference_n_iter': 10000, 'learn_time': 1, 'lower_s': -10, 'n_cores': 12, 'original_data': '../data/NMS_bottleneck_D.tsv', 'out_path': '../results/', 'pf_n_particles': 5000, 'pf_n_theta': 500, 'pgas_n_particles': 5000, 'proposal_step_sigma': [0.025, 0.025, 0.025, 0.025], 'seed': 2, 'true_x0': [0.42112125868690475, 0.10256939082931406, 0.029992607701921473, 0.19326294873370334], 'upper_s': 10}
../results/_GPP7H_202405-20-154950.867936
self.h is [0.05 0.05 0.05 0.05]
Warning! NOT NORMALISING VALUES TO COUNTS...
OutterPGAS: time_length, tau = 1.0 20
OutterPGAS: sample: self.tau=20
On iteration 2/10000 --- OutterPGAS --- 0:00:00. -- (llhood = -inf )
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
On iteration 3/10000 --- OutterPGAS --- 0:00:00. -- (llhood = -331819.08449738746 )
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 1
# of particles passed through the epsilon ball = 1
On iteration 4/10000 --- OutterPGAS --- 0:00:00. -- (llhood = -351292.53020089585 )
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
On iteration 5/10000 --- OutterPGAS --- 0:00:01. -- (llhood = -335819.33546219766 )
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
On iteration 6/10000 --- OutterPGAS --- 0:00:01. -- (llhood = -320895.32604131196 )
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
On iteration 7/10000 --- OutterPGAS --- 0:00:01. -- (llhood = -316850.2739594922 )
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 5000
On iteration 8/10000 --- OutterPGAS --- 0:00:02. -- (llhood = -201094.18156311114 )
# of particles passed through the epsilon ball = 5000
# of particles passed through the epsilon ball = 0
./NMS_fitclone.py:379: RuntimeWarning: invalid value encountered in true_divide
Traceback (most recent call last):
  File "./NMS_fitclone.py", line 8, in <module>
    CondExp().run_with_config_file('../data/NMS_bottleneck_D.yaml')
  File "<string>", line 36, in run_with_config_file
  File "<string>", line 637, in run
  File "<string>", line 223, in logic
  File "<string>", line 101, in _infer
  File "<string>", line 259, in sample
  File "<string>", line 339, in sample_non_parallel
  File "mtrand.pyx", line 1144, in mtrand.RandomState.choice
Traceback (most recent call last):
  File "./NMS_fitclone.py", line 8, in <module>
    CondExp().run_with_config_file('../data/NMS_bottleneck_B.yaml')
  File "<string>", line 36, in run_with_config_file
  File "<string>", line 637, in run
  File "<string>", line 223, in logic
  File "<string>", line 101, in _infer
  File "<string>", line 260, in sample
  File "<string>", line 343, in sample
  File "<string>", line 319, in sample
IndexError: index 4 is out of bounds for axis 0 with size 4

NMS_bottleneck_D.txt NMS_bottleneck_D_yaml.txt NMS_bottleneck_A.txt NMS_bottleneck_A_yaml.txt

pcream commented 4 months ago

OK, I've found that I can prevent the error from occurring by changing the seed (from 2 to 3 or 4). I'm not sure why that helps, or how it might affect the results, but I assume it is fine as long as there is a burn-in period.
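My reading of the log, which is an assumption and not something I have confirmed in the fitclone source: just before the crash, 0 particles passed through the epsilon ball, so every resampling weight was zero. Normalising those weights divides 0 by 0 (the true_divide warning), and the random choice call then fails on NaN probabilities. Below is a minimal pure-Python sketch of that failure mode and of a guard that reports it clearly. The names normalise_weights and resample_indices are hypothetical and are not fitclone functions.

```python
import math
import random


def normalise_weights(weights):
    """Scale weights to sum to 1, or raise if no weight is usable.

    Hypothetical helper (not part of fitclone). Without this check,
    all-zero weights would be divided by a zero total, giving NaNs.
    """
    total = math.fsum(weights)
    if not math.isfinite(total) or total <= 0.0:
        raise ValueError(
            "no particle has positive weight; cannot resample "
            "(did 0 particles pass the epsilon ball?)"
        )
    return [w / total for w in weights]


def resample_indices(weights, n, rng):
    """Draw n particle indices with probability proportional to weight."""
    probs = normalise_weights(weights)
    return rng.choices(range(len(probs)), weights=probs, k=n)


rng = random.Random(2)  # seed 2, as in the failing config

# Healthy step: every particle has positive weight, so resampling works.
healthy = resample_indices([1.0, 1.0, 2.0], n=5, rng=rng)

# Degenerate step: no particle survived, so all weights are zero.
try:
    resample_indices([0.0, 0.0, 0.0], n=5, rng=rng)
    degenerate_error = None
except ValueError as err:
    degenerate_error = str(err)
```

If this reading is right, changing the seed only avoids the degenerate step by chance, and I can't tell from the log alone whether it also changes the inferred fitness values.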

The second error (the IndexError) comes from proposal_step_sigma not having enough "terms" when running with more clones than the default 4. Again, I'm not sure how having more than 4 affects the run.
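A quick pre-flight check along these lines would have caught the mismatch before the sampler started. It assumes, based only on the config dump above, that K_prime is the number of clones being modelled and that proposal_step_sigma and true_x0 each need one entry per clone. check_config is a hypothetical helper, not a fitclone function, and the example values are rounded or made up for illustration.

```python
def check_config(config):
    """Return a list of length mismatches in a fitclone-style config dict.

    Hypothetical helper. Assumes K_prime is the number of clones and that
    each per-clone list must have exactly that many entries.
    """
    n_clones = config["K_prime"]
    problems = []
    for key in ("proposal_step_sigma", "true_x0"):
        n_entries = len(config[key])
        if n_entries != n_clones:
            problems.append(
                f"{key} has {n_entries} entries, expected {n_clones}"
            )
    return problems


# Four clones, four entries per list (values rounded from the dump above).
ok_config = {
    "K_prime": 4,
    "proposal_step_sigma": [0.025, 0.025, 0.025, 0.025],
    "true_x0": [0.42, 0.10, 0.03, 0.19],
}

# Five clones (made-up starting frequencies), sigma left at the default 4.
bad_config = {
    "K_prime": 5,
    "proposal_step_sigma": [0.025, 0.025, 0.025, 0.025],
    "true_x0": [0.30, 0.10, 0.03, 0.19, 0.12],
}

ok_problems = check_config(ok_config)
bad_problems = check_config(bad_config)
```

Here ok_problems is empty, while bad_problems reports the short proposal_step_sigma list.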