stanfordmlgroup / ngboost

Natural Gradient Boosting for Probabilistic Prediction
Apache License 2.0

update dependencies #227

Closed merl-dev closed 3 years ago

acturner commented 3 years ago

Any update on this - trying to use poetry to install packages that rely on sklearn >= 0.24 is currently impossible.

ryan-wolbeck commented 3 years ago

Any update on this - trying to use poetry to install packages that rely on sklearn >= 0.24 is currently impossible.

It's currently pinned below 0.24. Are you concerned that merging this will cause a break? I'm waiting for @merl-dev to resolve the conflicts.

ryan-wolbeck commented 3 years ago

@acturner

acturner commented 3 years ago

@ryan-wolbeck No concerns with merging this - one can't currently use poetry to install both ngboost and packages with sklearn >= 0.24, so I'm an advocate of this PR.

ryan-wolbeck commented 3 years ago

I resolved the conflicts. Once the checks pass, we can merge.

ryan-wolbeck commented 3 years ago

@alejandroschuler looks like we are getting some failed tests:

tests/test_distns.py::test_dists_runs_on_examples_logscore[learner1-T]
  /home/runner/.cache/pypoetry/virtualenvs/ngboost-ShN0YL_F-py3.6/lib/python3.6/site-packages/scipy/stats/_continuous_distns.py:5936: RuntimeWarning: overflow encountered in multiply
    lPx -= 0.5*np.log(r*np.pi) + (r+1)/2*np.log(1+(x**2)/r)

tests/test_distns.py: 609 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/t.py:126: RuntimeWarning: overflow encountered in square
    self.var = self.scale ** 2

tests/test_distns.py: 256 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/t.py:125: RuntimeWarning: overflow encountered in exp
    self.scale = np.exp(params[1])

tests/test_distns.py: 492 warnings
  /home/runner/.cache/pypoetry/virtualenvs/ngboost-ShN0YL_F-py3.6/lib/python3.6/site-packages/scipy/stats/_continuous_distns.py:5936: RuntimeWarning: overflow encountered in square
    lPx -= 0.5*np.log(r*np.pi) + (r+1)/2*np.log(1+(x**2)/r)

tests/test_distns.py: 42 warnings
  /home/runner/.cache/pypoetry/virtualenvs/ngboost-ShN0YL_F-py3.6/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:1802: RuntimeWarning: overflow encountered in true_divide
    x = np.asarray((x - loc)/scale, dtype=dtyp)

tests/test_distns.py: 259 warnings
  /home/runner/.cache/pypoetry/virtualenvs/ngboost-ShN0YL_F-py3.6/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py:1802: RuntimeWarning: divide by zero encountered in true_divide
    x = np.asarray((x - loc)/scale, dtype=dtyp)

tests/test_distns.py: 305 warnings
  /home/runner/.cache/pypoetry/virtualenvs/ngboost-ShN0YL_F-py3.6/lib/python3.6/site-packages/scipy/stats/_continuous_distns.py:243: RuntimeWarning: overflow encountered in square
    return np.exp(-x**2/2.0) / _norm_pdf_C

tests/test_distns.py: 349 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/normal.py:70: RuntimeWarning: overflow encountered in exp
    self.scale = np.exp(params[1])

tests/test_distns.py: 401 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/normal.py:71: RuntimeWarning: overflow encountered in square
    self.var = self.scale ** 2

tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner0-Normal]
tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner1-Normal]
tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner1-Normal]
  /home/runner/work/ngboost/ngboost/ngboost/distns/normal.py:29: RuntimeWarning: divide by zero encountered in true_divide
    Z = (Y - self.loc) / self.scale

tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner0-Normal]
tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner1-Normal]
tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner1-Normal]
  /home/runner/work/ngboost/ngboost/ngboost/distns/normal.py:33: RuntimeWarning: invalid value encountered in multiply
    - 1 / np.sqrt(np.pi)

tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner0-LogNormal]
tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner1-LogNormal]
  /home/runner/.cache/pypoetry/virtualenvs/ngboost-ShN0YL_F-py3.6/lib/python3.6/site-packages/sklearn/utils/validation.py:63: FutureWarning: Arrays of bytes/strings is being converted to decimal numbers if dtype='numeric'. This behavior is deprecated in 0.24 and will be removed in 1.1 (renaming of 0.26). Please convert your data to numeric values explicitly instead.
    return f(*args, **kwargs)

tests/test_distns.py: 566 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/lognormal.py:108: RuntimeWarning: overflow encountered in exp
    self.scale = np.exp(params[1])

tests/test_distns.py: 92 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/lognormal.py:52: RuntimeWarning: overflow encountered in true_divide
    Z = (lT - self.loc) / self.scale

tests/test_distns.py: 568 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/lognormal.py:64: RuntimeWarning: invalid value encountered in multiply
    return (1 - E) * crps_cens + E * crps_uncens

tests/test_distns.py: 290 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/lognormal.py:52: RuntimeWarning: divide by zero encountered in true_divide
    Z = (lT - self.loc) / self.scale

tests/test_distns.py: 290 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/lognormal.py:57: RuntimeWarning: invalid value encountered in multiply
    - 1 / np.sqrt(np.pi)

tests/test_distns.py: 290 warnings
  /home/runner/work/ngboost/ngboost/ngboost/distns/lognormal.py:62: RuntimeWarning: invalid value encountered in multiply
    - sp.stats.norm.cdf(np.sqrt(2) * Z) / np.sqrt(np.pi)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
=========================== short test summary info ============================
FAILED tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner0-LogNormal]
FAILED tests/test_distns.py::test_dists_runs_on_examples_crpscore[learner1-LogNormal]
=========== 2 failed, 63 passed, 7079 warnings in 289.85s (0:04:49) ============
make: *** [Makefile:15: test] Error 1
Error: Process completed with exit code 2.

It's fairly hard to decipher what is actually going on, and whether these warnings are a result of the dependency changes or are existing issues now being highlighted again, like #61.
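To show how the warning types in the log can relate to one another, here is a minimal NumPy-only sketch. It is not taken from ngboost: the function name `overflow_chain` and the log-scale values are hypothetical and deliberately extreme. It only illustrates that an unbounded log-scale parameter passed through `np.exp` can produce the overflow, divide-by-zero, and invalid-value warnings in sequence.

```python
import warnings

import numpy as np


def overflow_chain(log_scale, y=1.0, loc=0.0):
    """Mimic the scale/var/Z pattern from the log with extreme inputs.

    Hypothetical helper for illustration only; not part of ngboost.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        scale = np.exp(log_scale)   # very large input overflows to inf, very negative underflows to 0.0
        var = scale ** 2            # a large finite scale overflows when squared
        z = (y - loc) / scale       # dividing by a scale of 0.0 gives inf
        prod = z * scale            # 0 * inf and inf * 0 are both nan
    messages = [str(w.message) for w in caught]
    return scale, var, z, prod, messages


if __name__ == "__main__":
    # Hypothetical log-scale values: large, very large, very negative.
    params = np.array([400.0, 800.0, -800.0])
    scale, var, z, prod, messages = overflow_chain(params)
    print("scale:", scale)
    print("var:  ", var)
    print("z:    ", z)
    print("prod: ", prod)
    for m in sorted(set(messages)):
        print("warning:", m)
```

If the boosting steps push the log-scale parameter this far in a real run, the same chain would appear regardless of the installed dependency versions, which would be consistent with these being existing issues rather than new ones. That is an assumption this sketch does not verify.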

alejandroschuler commented 3 years ago

It's fairly hard to decipher what is actually going on, and whether these warnings are a result of the dependency changes or are existing issues now being highlighted again, like #61.

Yeah, I think it's the latter. Not sure what to do about it, tbh.

MikeOMa commented 3 years ago

I was a bit scared when I saw this, as I was the last person to work on the censored part when I made it picklable. I was worried I might have made a mistake copying the Y_from_censored code over, so I looked into it a bit.

I think the issue occurs when you run check_array(T, ensure_2d=False) on a numpy array T with

T.dtype == [
        ("Event", "?"),
        ("Time", "<f8"),
    ]

Maybe change the first part of Y_from_censored in ngboost.helpers to:

    if T is None:
        return None
    elif T.dtype == [
        ("Event", "?"),
        ("Time", "<f8"),
    ]:  # already processed. Necessary for when d_score() calls score() as in LogNormalCRPScore
        return T
    else:
        T = check_array(T, ensure_2d=False)
        T = T.reshape(T.shape[0])

This does make the test pass. However, I do not know how to push this into an existing pull request! I also don't know whether the above would break anything else.

Edit: I can try to push the above soon if nobody objects! I just don't have time to try right now. If anybody wants it fixed sooner, feel free to do it!
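The early-return idea above can be sketched as a self-contained function. This is an illustration under stated assumptions, not the ngboost implementation: `y_from_censored_sketch` and `CENSORED_DTYPE` are hypothetical names, `np.asarray` stands in for sklearn's `check_array` so the example runs with NumPy alone, and the handling of the event indicator `E` is an assumption, since only the first part of the function is shown in this thread.

```python
import numpy as np

# Structured dtype quoted in the comment above: an event flag plus a time.
CENSORED_DTYPE = np.dtype([("Event", "?"), ("Time", "<f8")])


def y_from_censored_sketch(T, E=None):
    """Build a structured (Event, Time) array, passing through processed input.

    Hypothetical stand-in for illustration; np.asarray replaces check_array.
    """
    if T is None:
        return None
    T = np.asarray(T)
    if T.dtype == CENSORED_DTYPE:
        # Already processed: return as-is instead of validating it as numeric.
        return T
    T = np.asarray(T, dtype=np.float64).reshape(-1)
    # Assumption: with no event indicator, every observation is an event.
    E = np.ones_like(T) if E is None else np.asarray(E).reshape(-1)
    Y = np.empty(T.shape[0], dtype=CENSORED_DTYPE)
    Y["Event"] = E.astype(bool)
    Y["Time"] = T
    return Y


if __name__ == "__main__":
    Y = y_from_censored_sketch([1.0, 2.0], [1, 0])
    print(Y.dtype)
    # A second call on the processed array returns the same object untouched.
    print(y_from_censored_sketch(Y) is Y)
```

The point of the guard is that a second call on an already structured array never reaches the numeric validation step, which is where the conversion of the structured dtype appears to go wrong.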

ryan-wolbeck commented 3 years ago

@MikeOMa thanks for this! I'll take a look at it later today or tomorrow

ryan-wolbeck commented 3 years ago

Alright, I think this PR is ready to go now. @MikeOMa want to take a look?