nuclear-multimessenger-astronomy / nmma

A pythonic library for probing nuclear physics and cosmology with multimessenger analysis
https://nuclear-multimessenger-astronomy.github.io/nmma/
GNU General Public License v3.0

Restrict scikit-learn version to <1.2 #231

Closed bfhealy closed 1 year ago

bfhealy commented 1 year ago

This PR restricts the scikit-learn version in requirements.txt to <1.2 (in addition to >=1.0.2). Assuming the tests pass with the different package version, this resolves #173. The .joblib files loaded when setting --ztf-uncertainties and --ztf-sampling were likely generated using an older version of scikit-learn, raising unusual errors when attempting to load them with newer versions of the package. A longer-term solution will be to regenerate those files with newer code so we don't need to constrain the package version.
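For reference, the resulting constraint in requirements.txt would look something like this (the surrounding lines are illustrative, not the actual file contents):

```
# Pin below 1.2 until the pickled .joblib files are regenerated
# with a newer scikit-learn (see #173).
scikit-learn>=1.0.2,<1.2
```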

tylerbarna commented 1 year ago

Does this have any impact on the version of numpy that is installed? I know I've encountered a couple of warnings when setting up a new nmma environment about the compatibility of the scikit-learn and numpy versions

mcoughlin commented 1 year ago

@bfhealy @tylerbarna I will check with Ari about getting these updated, but for now, this seems good.

tylerbarna commented 1 year ago

@bfhealy do you know if there are any compatibility issues with numpy/pandas?

bfhealy commented 1 year ago

@tylerbarna So far I haven't encountered any. I set up a new environment today and didn't get any related warnings/errors.

tylerbarna commented 1 year ago

@bfhealy Interesting. What Python version did you set up your environment with, and did you use pip or conda to install parallel-bilby?

tylerbarna commented 1 year ago

I'm getting a new, fun error relating to pandas when trying to generate some lightcurves from an injection of around 100 events. I was able to resolve the version warning by pinning numpy to 1.22.4 (at least on Python 3.10), but I hit the following error:

No injection files provided, will generate injection based on the prior file provided only
17:19 bilby_pipe INFO    : Created injection file ./lightcurves/nugent-hyper.json
Traceback (most recent call last):
  File "/home/tbarna/anaconda3/envs/nmma_env/bin/light_curve_generation", line 8, in <module>
    sys.exit(main())
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/nmma/em/create_lightcurves.py", line 333, in main
    data = create_light_curve_data(
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/nmma/em/injection.py", line 74, in create_light_curve_data
    ztfuncer = load(f)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/joblib/numpy_pickle.py", line 648, in load
    obj = _unpickle(fobj)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/joblib/numpy_pickle.py", line 577, in _unpickle
    obj = unpickler.load()
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/pickle.py", line 1590, in load_reduce
    stack[-1] = func(*args)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
    return klass(values, ndim=ndim, placement=placement, refs=refs)
TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)
bfhealy commented 1 year ago

@tylerbarna I used python 3.10 and conda installed parallel-bilby (version 2.0.2). My numpy version is 1.24.3, and pandas is 2.1.0.

Could you please share the commands you ran so I can give them a try?

tylerbarna commented 1 year ago

@bfhealy here's the repo, should work by just running the script with the nmma environment active https://github.com/tylerbarna/nmma-model-recovery

tylerbarna commented 1 year ago

@bfhealy were you able to try running it? It occurs to me that it depends on another repo for the priors: https://github.com/tylerbarna/dsmma_kn_23/tree/main/priors

I just pushed a commit that should include a copy of the priors inside the nmma-model-recovery repo so there isn't that dependency

bfhealy commented 1 year ago

@tylerbarna Thanks for adding the priors. While running your script I encountered the same TypeError you shared above. I don't get the error if I comment --ztf-uncertainties, so I think once again this has to do with the pickled ZTF-related files we plan to update. For now I was able to get things to work by downgrading pandas to 1.5.3 (pip install 'pandas<2.0').
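The longer-term fix mentioned above is to regenerate the pickled files under the currently installed library versions, since joblib pickles are not portable across major scikit-learn/pandas releases. A minimal sketch of the round trip (the estimator and filename here are hypothetical stand-ins, not the actual ZTF files):

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.linear_model import LinearRegression

# Fit a small model and persist it with the *currently installed*
# scikit-learn, so later loads in the same environment don't hit
# cross-version unpickling errors like the TypeError above.
model = LinearRegression().fit(np.arange(10).reshape(-1, 1), np.arange(10))

path = os.path.join(tempfile.mkdtemp(), "ztf_uncertainty.joblib")  # hypothetical name
dump(model, path)

restored = load(path)  # safe: the same library versions wrote and read the file
print(float(restored.predict([[5.0]])[0]))
```

The same dump/load pattern applied to the real ZTF uncertainty objects would remove the need to pin scikit-learn or pandas at all.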

tylerbarna commented 1 year ago

@bfhealy would you mind pulling one more time and running a basic analysis on one of the generated lightcurves I pushed, something like

light_curve_analysis --data lightcurves/nugent-hyper_0.json --model nugent-hyper --prior priors/nugent-hyper.prior --remove-nondetections --trigger-time 44244

I've been encountering an issue with my ev.dat being empty, and I haven't been able to figure out whether it's a problem with the environment or with the way I'm generating the lightcurves

bfhealy commented 1 year ago

@tylerbarna I ran that command and sampling completed successfully. I got an ev.dat file that's 2MB in size. Perhaps a fresh environment installation would help?

I do get several warnings about a change in prior name: Warning: the 'KNtimeshift' parameter is deprecated as of nmma 0.0.19, please update your prior to use 'timeshift' instead
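Fixing that warning should just be a one-line rename in the prior file; the bounds below are hypothetical, only the parameter name matters:

```
# before (deprecated as of nmma 0.0.19)
KNtimeshift = Uniform(minimum=-0.1, maximum=0.1, name='KNtimeshift')

# after
timeshift = Uniform(minimum=-0.1, maximum=0.1, name='timeshift')
```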

tylerbarna commented 1 year ago

@bfhealy very odd, I've encountered this same issue on two different systems now, both MSI and my local PC (WSL).

12:06 bilby INFO    : Using temporary file /tmp/tmpq58ivb8q
 *****************************************************
 MultiNest v3.10
 Copyright Farhan Feroz & Mike Hobson
 Release Jul 2015

 no. of live points = 2048
 dimensionality =    4
 resuming from previous job
 *****************************************************
 Starting MultiNest
Acceptance Rate:                        1.000000
Replacements:                               2048
Total Samples:                              2048
Nested Sampling ln(Z):            **************
12:06 bilby INFO    : Overwriting outdir/pm_injection/ with /tmp/tmpq58ivb8q/
 ln(ev)=  -7.0874296015377425E-016 +/-   5.8827365954340079E-010
 Total Likelihood Evaluations:         2048
 Sampling finished. Exiting MultiNest
  analysing data from /tmp/tmpq58ivb8q/.txt
12:06 bilby INFO    : Overwriting outdir/pm_injection/ with /tmp/tmpq58ivb8q/
/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/pymultinest.py:193: UserWarning: genfromtxt: Empty input file: "outdir/pm_injection//ev.dat"
  dead_points = np.genfromtxt(dir_ + "/ev.dat")
Traceback (most recent call last):
  File "/home/tbarna/anaconda3/envs/nmma_env/bin/light_curve_analysis", line 8, in <module>
    sys.exit(main())
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/nmma/em/analysis.py", line 909, in main
    analysis(args)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/nmma/em/analysis.py", line 655, in analysis
    result = bilby.run_sampler(
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/__init__.py", line 234, in run_sampler
    result = sampler.run_sampler()
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/base_sampler.py", line 97, in wrapped
    output = method(self, *args, **kwargs)
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/pymultinest.py", line 178, in run_sampler
    self.result.nested_samples = self._nested_samples
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/bilby/core/sampler/pymultinest.py", line 201, in _nested_samples
    np.vstack([dead_points, live_points]).copy(),
  File "<__array_function__ internals>", line 180, in vstack
  File "/home/tbarna/anaconda3/envs/nmma_env/lib/python3.10/site-packages/numpy/core/shape_base.py", line 282, in vstack
    return _nx.concatenate(arrs, 0)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 0 and the array at index 1 has size 7
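The shape mismatch in that ValueError follows directly from ev.dat being empty: bilby reads it with np.genfromtxt, an empty file yields a zero-length 1-D array, and vstack cannot stack that against the (2048, 7) live-points array. A small reproduction with stand-in arrays (shapes taken from the log above):

```python
import numpy as np

dead_points = np.empty((0,))      # what genfromtxt effectively yields for an empty ev.dat
live_points = np.ones((2048, 7))  # 2048 live points, 7 columns, as in the traceback

try:
    # vstack promotes the empty array to shape (1, 0), which cannot be
    # concatenated with (2048, 7) along axis 0.
    np.vstack([dead_points, live_points])
except ValueError as err:
    print("vstack failed:", err)
```

So the ValueError is a symptom, not the cause; the real question is why MultiNest wrote no dead points to ev.dat.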
bfhealy commented 1 year ago

@tylerbarna Hmm, since the issue is with the sampling I wonder if it has to do with pymultinest. There was a recent release of version 2.12 on PyPI. Maybe it's worth upgrading if you haven't already?

tylerbarna commented 1 year ago

@bfhealy just checked, looks like pymultinest is already on version 2.12

tylerbarna commented 1 year ago

here's the output I'm getting from conda list. @bfhealy could you print out your versions so we can figure out which packages might be on different versions?
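One quick way to dump just the versions likely to matter for the comparison (the package list here is my guess at the likely culprits):

```python
import importlib.metadata as md

# Print pinned-style version strings for the packages most likely to
# differ between the two environments.
for pkg in ("numpy", "pandas", "scikit-learn", "joblib", "pymultinest", "bilby"):
    try:
        print(f"{pkg}=={md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")
```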

mcoughlin commented 1 year ago

12:06 bilby INFO    : Using temporary file /tmp/tmpq58ivb8q
 *****************************************************
 MultiNest v3.10
 Copyright Farhan Feroz & Mike Hobson
 Release Jul 2015

 no. of live points = 2048
 dimensionality =    4
 resuming from previous job
 *****************************************************
 Starting MultiNest
Acceptance Rate:                        1.000000
Replacements:                               2048
Total Samples:                              2048
Nested Sampling ln(Z):            **************
12:06 bilby INFO    : Overwriting outdir/pm_injection/ with /tmp/tmpq58ivb8q/
 ln(ev)=  -7.0874296015377425E-016 +/-   5.8827365954340079E-010
 Total Likelihood Evaluations:         2048
 Sampling finished. Exiting MultiNest

This "resuming from previous job" line seems pretty weird.