LSSTDESC / rail_tpz

RAIL-wrapped version of a "lite" version of Matias Carrasco-Kind's TPZ tree-based photo-z code
MIT License

Track down intermittent error in tree logic in rail_tpz #4

Closed: sschmidt23 closed this issue 11 months ago

sschmidt23 commented 11 months ago

There is an intermittent test failure in rail_tpz that I think is caused by an uncaught edge case in the tree search logic. Occasionally we get an

AttributeError: 'numpy.ndarray' object has no attribute 'dim'

at L613 in TPZ.py (https://github.com/LSSTDESC/rail_tpz/blob/aff3d58f7ee670e8b4812b09f93bc35995dd5552/src/rail/estimation/algos/ml_codes/TPZ.py#L613C2-L613C2), which is the line: sd = node.dim

Here is the full stack trace:

>       results, rerun_results, _ = one_algo("TPz_lite", train_algo, pz_algo,
                                             train_config_dict, estim_config_dict)

tests/test_tpz.py:44: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/rail/core/algo_utils.py:34: in one_algo
    estim = pz.estimate(validation_data)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/rail/estimation/estimator.py:96: in estimate
    self.run()
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/rail/estimation/estimator.py:110: in run
    self._process_chunk(s, e, test_data, first)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/rail/estimation/algos/tpz_lite.py:270: in _process_chunk
    temp = S.get_vals(Test.X[i])
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/rail/estimation/algos/ml_codes/TPZ.py:472: in get_vals
    out = search(line, self.root)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

line = array([24.86733437, 23.61107445, 22.98801994, 22.69544411, 22.58488274,
       22.46894836])
node = array([0.15190487, 0.10387819, 0.13813334, 0.14970961, 0.15091318,
       0.10252265, 0.13336275, 0.1185175 , 0.104033...72, 0.11283907, 0.1536378 , 0.13871538, 0.12140052,
       0.14970961, 0.15190487, 0.10252265, 0.14841769, 0.14604992])

    def search(line, node):
>       sd = node.dim
E       AttributeError: 'numpy.ndarray' object has no attribute 'dim'

I think node should be an InsertNode object (see L152 for the class), which should have .dim assigned. Instead, the trace shows node arriving in search as a plain numpy array. I'm not sure why this pops up; I will need to do a detailed trace through the tree logic in the near future.
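One possible reading of the trace, not yet confirmed, is that the root handed to search is already a leaf (a bare array of redshift values) rather than a node with .dim, as could happen if a tree never splits. The sketch below only illustrates that failure mode and one way to guard against it. It is not the code in TPZ.py: the Node class and its point, left, and right attributes are hypothetical names made up for this example, and only the names search, line, node, and dim come from the trace.

```python
import numpy as np


class Node:
    """Hypothetical internal node: splits on column `dim` at value `point`."""

    def __init__(self, dim, point, left, right):
        self.dim = dim
        self.point = point
        self.left = left      # either another Node or a leaf (np.ndarray)
        self.right = right


def search(line, node):
    """Descend from `node` until a leaf (a plain ndarray of values) is reached.

    Checking for a leaf *before* touching `node.dim` means a tree whose root
    is itself a leaf is handled instead of raising AttributeError.
    """
    while not isinstance(node, np.ndarray):
        node = node.left if line[node.dim] < node.point else node.right
    return node


# Toy data, loosely shaped like the values in the trace (assumed, not real).
leaf_a = np.array([0.10, 0.12])
leaf_b = np.array([0.15, 0.14])
tree = Node(dim=0, point=24.0, left=leaf_a, right=leaf_b)
line = np.array([24.87, 23.61, 22.99])

result_split = search(line, tree)    # 24.87 >= 24.0, so the right branch is taken
result_leaf = search(line, leaf_a)   # root is already a leaf: returned unchanged
```

If the real tree can legitimately produce a leaf-only root, a guard of this kind in search (or in get_vals before it calls search) would avoid the AttributeError; if it cannot, the array root points to a bug earlier in tree construction, and the guard would only hide it.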

sschmidt23 commented 11 months ago

I should add that I have not encountered this error when running with my larger 10k galaxy training set, so this may just be related to the very small 100 galaxy training set that we use in the tests. The first thing to try is to swap out the 100 galaxy training set for the 10,000 galaxy training set and see if the error goes away.

sschmidt23 commented 11 months ago

Closing after creating a duplicate issue, as the secret wasn't set to keep track of this one properly.