LSSTDESC / rail_base

Base classes for RAIL
MIT License
0 stars 1 forks source link

Issue/10/tomo bins #18

Closed hangqianjun closed 1 year ago

hangqianjun commented 1 year ago

Change Description

Solution Description

In this PR I have added a tomographer module under src/rail/estimation. It contains two generic tomographer classes:

For each of these types, I've added an example classifier in algos/. naiveClassifierSRD is a PZtomographer that uses simple point estimate SRD binning; randomForestClassifier is a CatTomographer which is adapted from TXPipe.

Code Quality

Project-Specific Pull Request Checklists

Bug Fix Checklist

New Feature Checklist

Documentation Change Checklist

Build/CI Change Checklist

Other Change Checklist

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.19% :tada:

Comparison is base (c84ab6a) 95.86% compared to head (727298d) 96.05%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #18 +/- ## ========================================== + Coverage 95.86% 96.05% +0.19% ========================================== Files 29 32 +3 Lines 1643 1724 +81 ========================================== + Hits 1575 1656 +81 Misses 68 68 ``` | [Files Changed](https://app.codecov.io/gh/LSSTDESC/rail_base/pull/18?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LSSTDESC) | Coverage Δ | | |---|---|---| | [src/rail/estimation/algos/equal\_count.py](https://app.codecov.io/gh/LSSTDESC/rail_base/pull/18?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LSSTDESC#diff-c3JjL3JhaWwvZXN0aW1hdGlvbi9hbGdvcy9lcXVhbF9jb3VudC5weQ==) | `100.00% <100.00%> (ø)` | | | [src/rail/estimation/algos/uniform\_binning.py](https://app.codecov.io/gh/LSSTDESC/rail_base/pull/18?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LSSTDESC#diff-c3JjL3JhaWwvZXN0aW1hdGlvbi9hbGdvcy91bmlmb3JtX2Jpbm5pbmcucHk=) | `100.00% <100.00%> (ø)` | | | [src/rail/estimation/classifier.py](https://app.codecov.io/gh/LSSTDESC/rail_base/pull/18?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=LSSTDESC#diff-c3JjL3JhaWwvZXN0aW1hdGlvbi9jbGFzc2lmaWVyLnB5) | `100.00% <100.00%> (ø)` | |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

hangqianjun commented 1 year ago

While trying to writing a unit test for the PZTomographer algorithm, I realised that there doesn't seem to be test qp.Ensemble data available in src/rail/examples_data/testdata/. Could we add a small test qp file like test_dc2_training_9816.hdf5?

eacharles commented 1 year ago

Depends on the size.   If it is more that a few 10’s of galaxies it would be better to download the data than to include it in the repo.On Jul 4, 2023, at 12:20 PM, hangqianjun @.***> wrote: While trying to writing a unit test for the PZTomographer algorithm, I realised that there doesn't seem to be test qp.Ensemble data available in src/rail/examples_data/testdata/. Could we add a small test qp file like test_dc2_training_9816.hdf5?

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you are subscribed to this thread.Message ID: @.***>

hangqianjun commented 1 year ago

Depends on the size. If it is more that a few 10’s of galaxies it would be better to download the data than to include it in the repo.

Are these data available somewhere already?

eacharles commented 1 year ago

Does src/rail/examples_data/testdata/output_BPZ_lite.fits work? it is already in the repo.

sschmidt23 commented 1 year ago

@hangqianjun I was looking at this PR earlier, I don't think that we want to have input parameters in a .ini file unless there is a strong reason to do so, as that file could change between runs if someone pushes a change and updated values could lead to non-reproducibility on subsequent runs. Having all of the parameters as config params seems like a better way of tracking things in terms of reproducibility. Was there a reason to do this with a .ini file, e.g. maybe TXPipe does something like that?

Also, output_BPZ_lite.fits may not be the best test file, as the mode values for the first 10 galaxies are all either at z<0.2 or z>2.8, and thus the default tomographer value (for SRD binning) for all 10 is -99. We may need to include another small test file here with more appropriate mode values.

Other than that, this looks very nice!

hangqianjun commented 1 year ago

Hi @aimalz, are you happy with the module/algorithm names? Let me know if you have suggestions!