statsmodels / statsmodels

Statsmodels: statistical modeling and econometrics in Python
http://www.statsmodels.org/devel/
BSD 3-Clause "New" or "Revised" License

REL: release 0.8 - changes 0.8rc1 #3075

Closed josef-pkt closed 5 years ago

josef-pkt commented 8 years ago

an issue to keep track of merges during rc1; cross-link: #2176 has the notes for the rc1 release preparation

#3074 forward ports 3 commits made to maintenance before release

additional changes will be made to master and backported (cherry-picked) to maintenance/0.8.x

extra

currently 28 PR closed, 3 backported, 25 in merge backport

some doc changes are still missing, not yet in master
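
The workflow described above (merge to master, then cherry-pick to maintenance/0.8.x) could be scripted roughly as follows. This is a minimal sketch, not the project's actual tooling: the helper names are hypothetical, and it assumes that PR merge commits are picked with `-m 1` (first parent as mainline) and `-x` (record the original hash in the new commit message).

```python
import subprocess


def cherry_pick_command(sha, is_merge=True):
    """Build the git command that backports a single commit."""
    cmd = ["git", "cherry-pick", "-x"]
    if is_merge:
        # A merge commit needs a mainline parent; 1 is the branch merged into.
        cmd += ["-m", "1"]
    cmd.append(sha)
    return cmd


def backport(shas_oldest_first, branch="maintenance/0.8.x", dry_run=True):
    """Return (and optionally run) the commands for a batch of backports."""
    commands = [["git", "checkout", branch]]
    commands += [cherry_pick_command(sha) for sha in shas_oldest_first]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first conflict so it can be fixed by hand.
            subprocess.run(cmd, check=True)
    return commands
```

With `dry_run=True` (the default) nothing is executed; the returned list can simply be printed and reviewed before running it for real.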

matthew-brett commented 8 years ago

Backport of test precision fix : https://github.com/statsmodels/statsmodels/pull/3093

josef-pkt commented 8 years ago

list of merge commits in master:

git log --merges origin/master --pretty=oneline --abbrev-commit

022a76b Merge pull request #3071 from bashtage/appveyor-clone-depth
85ec79c Merge pull request #3072 from bashtage/correct-test-coverage
075415d Merge pull request #3073 from bashtage/test-datasets
#032a9ab Merge pull request #3069 from matthew-brett/relax-statespace-for-32bit
c5b2f2e Merge pull request #3080 from bashtage/update-pypi-info
35ef666 Merge pull request #3084 from bashtage/chelleych-arima_longar_start_para
c914ea5 Merge pull request #3095 from kshedden/phreg_patsy
cb8dd76 Merge pull request #3100 from kshedden/duration_readme
2d95be0 Merge pull request #3088 from bashtage/remove-sourceforge
ba01d87 Merge pull request #3081 from bashtage/jarque-bera-null
#f86c074 Merge pull request #3104 from ChadFulton/gh-3092
760e446 Merge pull request #3077 from bashtage/test-missing-plots
#9ccf83f Merge pull request #3110 from ChadFulton/gh-3108
036ad9f Merge pull request #3120 from N-Wouda/double_key
60bf8a1 Merge pull request #3147 from josef-pkt/test_scipy08_3128
92da1cd Merge pull request #3102 from bashtage/fix-adf-docs
5c6280b Merge pull request #3157 from josef-pkt/bug_ttest_summary_header_3116
a111886 Merge pull request #3158 from josef-pkt/bug_influence_plot_3103
2a94f9f Merge pull request #3160 from josef-pkt/pandas_compat_x13
ef53915 Merge pull request #3149 from josef-pkt/predict_missing_3087
5fa4f8e Merge pull request #3161 from josef-pkt/multinomial_ci2
e0d5807 Merge pull request #3162 from josef-pkt/covtype_hacpanel_groups
3116b58 Merge pull request #3083 from bashtage/jseabold-pandas-deprecations
87fd2d3 Merge pull request #3099 from bashtage/add-eigenvalue-test
1a09acf Merge pull request #3164 from josef-pkt/bug_glm_residworking
82bc044 Merge pull request #3112 from kshedden/phreg_einsum
1c6eff4 Merge pull request #3166 from josef-pkt/bug_discrete_pred_table
1168655 Merge pull request #3146 from josef-pkt/coint_aeg

# already backported ? not under backport label

left out from description list

3157

no reason not to backport:

3072

3120
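
A list in this format can be turned into structured data with a few lines of Python. The sketch below is only an illustration: the function name and the returned field names are made up here, and it assumes the convention used in the list above, where a leading "#" marks a merge that is already backported.

```python
import re

# One line of `git log --merges --pretty=oneline --abbrev-commit`,
# optionally prefixed with "#" (already backported, per the list above).
MERGE_RE = re.compile(
    r"^(?P<mark>#?)(?P<sha>[0-9a-f]{7,40}) "
    r"Merge pull request #(?P<pr>\d+) from (?P<branch>\S+)"
)


def parse_merges(text):
    """Return one dict per recognised merge line; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = MERGE_RE.match(line.strip())
        if match:
            rows.append({
                "sha": match.group("sha"),
                "pr": int(match.group("pr")),
                "branch": match.group("branch"),
                "backported": match.group("mark") == "#",
            })
    return rows
```

Filtering the result on `backported` being false gives the PR numbers that still need a cherry-pick.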

josef-pkt commented 8 years ago

To do:

https://github.com/statsmodels/statsmodels/labels/backport-0.8.0 includes two "BUG"

plus missing items in docs (e.g. coint) and update release notes

mangecoeur commented 7 years ago

Is a final 0.8 release of statsmodels coming?

josef-pkt commented 7 years ago

a few more backports for compatibility issues that showed up recently

some issues and PRs are still open

and there are a few more recent bugfixes that could be backported

AFAICS, all other recent tsa fixes have been backported

any DOC fixes to backport? There are a few PRs with typo corrections.

TomAugspurger commented 7 years ago

@josef-pkt is there anything I can do to help with a release? I'm teaching a class in a few weeks, and it'd be nice to have a stable release out.

Maybe for a separate issue, but what all can we do to make the release process less burdensome? What are the big time sinks? Release notes? pandas' requirement for each Pull Request to include a note in the whatsnew file has been successful I think.

ChadFulton commented 7 years ago

I'm also happy to work on the release, but I don't really know all the moving pieces of a release.

I'm not really suggesting this :) but I think that one thing that would make releases easier and less burdensome would be to delete all existing issues and all existing PRs. What I mean is just that there are a lot of issues that refer to ideas from long ago and code that is no longer actively curated by anyone. And since the subject is pretty technical, there's a pretty high bar for "jumping in".

For example, #1301 is prio-high, but was created 3 years ago and set to prio-high over a year ago. It's for the ARMA models which I think aren't really maintained except to put out fires.

(I am of course guilty of creating code that would be tough for someone else to maintain - trying to learn the whole state space code base would require some pretty serious effort, unfortunately - in my defense, Valera was able to at least make use of it last summer ;)

My opinion is that some judicious pruning (possibly even of models / features) could be useful given the manpower constraints. But that is a separate issue from the 0.8 release.

TomAugspurger commented 7 years ago

Most recent first.

hash msg backport
13fae462 Merge pull request #3416 from bashtage/quackdaddy-alpha-fix #3408
10a54287 Merge pull request #3414 from bashtage/high-r2-cointegration #3408
e6d9926a Merge pull request #3413 from bashtage/kde-doc #3408
8fdfd660 Merge pull request #3411 from TomAugspurger/predict-series-frame #3408
4d265c1d Merge pull request #3412 from bashtage/getnewargs-arima #3408
67dcd43f Merge pull request #3348 from bashtage/fix-future-warnings #3408
71da899f Merge pull request #3406 from has2k1/fix-prediction-results-no-wieghts #3408
47b7c9de Merge pull request #3390 from bashtage/fix-python-2.6-errors no (after #3327)
706f4b99 Merge pull request #3370 from lorentzenchr/families_link no
b65da36a DOC: Improve documentation about glm #2959 update3 expectation #3408
6b3e1954 DOC: Improve documentation about glm #2959 update2 Tweedie #3408
ae69ad5b DOC: Improve documentation about glm #2959 update1 #3408
17c1c082 Merge pull request #3364 from lorentzenchr/glm_docu no (splitting)
792a3c5d Merge pull request #3387 from ChadFulton/mswitch-pd-dates #3408
73a36fd9 Merge pull request #3383 from ChadFulton/sarimax-000 no
ec78b871 Merge pull request #3385 from kshedden/idlink #3408
9ab4bcfc Merge pull request #3126 from kshedden/logrank_entry #3408
c6a94add Merge pull request #3382 from sinhrks/timeseries #3408
682b1b58 Merge pull request #3365 from ChadFulton/ucm-init-kwds #3408
abb5996d Merge pull request #3106 from kshedden/ridge #3408
81e0f47c Merge pull request #3179 from kshedden/mice-3177 #3408
22f27667 Merge pull request #3350 from ChadFulton/gh-3349 no
b97c4211 DOC: Improve documentation about glm #2959 #3408
621f37e4 Merge remote-tracking branch 'upstream/master' no (splitting)
42d2f579 Merge pull request #3341 from ChadFulton/gh-3283 #3342
fea7e827 Merge pull request #3333 from josef-pkt/MAINT_numpy_compat #3408
8dff66c9 Merge pull request #3331 from thequackdaddy/global_stringio #3408
7e6b94b5 Merge pull request #3239 from thequackdaddy/stringio #3408
fef71e5a Merge pull request #3327 from yl565/manova_squashed no
45f9e9a4 Merge pull request #3293 from partev/patch-2 #3408
d265150a Merge pull request #3301 from alno/patch-1 #3408
1223724d Merge pull request #3311 from josef-pkt/mnlogit_margin_frame #3408
9ee558b5 Merge pull request #3305 from ChadFulton/gh-3304 no
52f1115f Merge pull request #3272 from ChadFulton/tsa-dates no
57972486 Merge pull request #3292 from partev/patch-1 #3408
e90b71ed Merge pull request #3289 from ChadFulton/gh-3286 no
07c721a8 Merge pull request #3245 from vlas-sokolov/fix-typos #3408
7c723cf1 Merge pull request #3263 from ChadFulton/ss-simulate #3265
c3493d91 Merge pull request #3252 from ChadFulton/ss-int-sarimax #3408
44b5315c Merge pull request #3251 from statsmodels/ss-uc-doc-fix #3408
fc3584f2 Merge pull request #3243 from ChadFulton/ss-doc-fix #3408
c2ff5471 Merge pull request #2845 from ChadFulton/ss-cykfs no
9872ada5 Merge pull request #3184 from fisadev/master #3408
49114895 Merge pull request #3206 from yarikoptic/enh-perms #3408
1a0d7118 Merge pull request #3218 from feeds/master #3408
b6dae4f0 Merge pull request #3141 from ChadFulton/ms-cysm no
03158f2a Merge pull request #3113 from ChadFulton/gh-3068 #3175
f5c3d361 Merge pull request #3205 from ChadFulton/ss-save-res #3408
bb1db2b4 Merge pull request #3140 from ChadFulton/gh-3140 #3174
d867d8a9 Merge pull request #3170 from jvhaggard/jeffreys_typo_fix #3408
11686550 Merge pull request #3146 from josef-pkt/coint_aeg #3171
1c6eff48 Merge pull request #3166 from josef-pkt/bug_discrete_pred_table #3171
82bc0449 Merge pull request #3112 from kshedden/phreg_einsum #3171
1a09acfb Merge pull request #3164 from josef-pkt/bug_glm_residworking #3171
87fd2d3f Merge pull request #3099 from bashtage/add-eigenvalue-test #3171
3116b589 Merge pull request #3083 from bashtage/jseabold-pandas-deprecations #3171
e0d5807b Merge pull request #3162 from josef-pkt/covtype_hacpanel_groups #3171
5fa4f8ed Merge pull request #3161 from josef-pkt/multinomial_ci2 #3171
ef539151 Merge pull request #3149 from josef-pkt/predict_missing_3087 #3171
2a94f9f7 Merge pull request #3160 from josef-pkt/pandas_compat_x13 #3171
a1118869 Merge pull request #3158 from josef-pkt/bug_influence_plot_3103 #3171
5c6280bf Merge pull request #3157 from josef-pkt/bug_ttest_summary_header_3116 #3171
92da1cd3 Merge pull request #3102 from bashtage/fix-adf-docs #3171
60bf8a14 Merge pull request #3147 from josef-pkt/test_scipy08_3128 #3171
036ad9f7 Merge pull request #3120 from N-Wouda/double_key #3171
9ccf83fe Merge pull request #3110 from ChadFulton/gh-3108 #3114
760e446c Merge pull request #3077 from bashtage/test-missing-plots #3171
f86c0742 Merge pull request #3104 from ChadFulton/gh-3092 #3107
ba01d871 Merge pull request #3081 from bashtage/jarque-bera-null #3171
2d95be06 Merge pull request #3088 from bashtage/remove-sourceforge #3171
cb8dd764 Merge pull request #3100 from kshedden/duration_readme #3171
c914ea5e Merge pull request #3095 from kshedden/phreg_patsy #3171
35ef6664 Merge pull request #3084 from bashtage/chelleych-arima_longar_start_params #3171
c5b2f2ed Merge pull request #3080 from bashtage/update-pypi-info #3171
032a9abb Merge pull request #3069 from matthew-brett/relax-statespace-for-32bit #3093
075415d6 Merge pull request #3073 from bashtage/test-datasets #3171
85ec79cf Merge pull request #3072 from bashtage/correct-test-coverage #3171
022a76bb Merge pull request #3071 from bashtage/appveyor-clone-depth #3171
9527b0e6 Merge pull request #3074 from josef-pkt/forward_08 no
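
To check that every merge in a table like the one above is accounted for, the rows can be parsed and grouped by their backport column. This is a rough sketch under the assumption that each row is "hash, message, backport", with the last column being either a PR reference such as #3408 or a note starting with "no"; the function names are hypothetical.

```python
import re
from collections import Counter

# hash (8 hex chars), free-form message, then the backport column.
ROW_RE = re.compile(
    r"^(?P<sha>[0-9a-f]{8}) (?P<msg>.+?) (?P<backport>#\d+|no(?: \(.*\))?)$"
)


def parse_backport_table(text):
    """Parse table rows; the header and unrelated lines are ignored."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def summarize(rows):
    """Count rows per backport PR and list the commits marked as 'no'."""
    per_target = Counter(
        r["backport"] for r in rows if r["backport"].startswith("#")
    )
    skipped = [r["sha"] for r in rows if r["backport"].startswith("no")]
    return per_target, skipped
```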

josef-pkt commented 7 years ago

Sorry, I didn't have time today to look at it.

I think at this stage we should only backport the missing compatibility fixes and bugfixes. I don't think we should start with adding refactorings to the 0.8 branch. The main reason is that statsmodels 0.8 went through Debian testing, and their version differs by just a few extra compatibility fixes from our maintenance/0.8. If we add many changes I guess we would need more Debian testing again or have diverging 0.8 versions.

Chad's statespace refactoring in master went (surprisingly) smoothly, and I haven't seen any problems yet. But it's still quite a large change compared to current 0.8.

#3370 from lorentzenchr/families_link is a good refactoring going forward but not necessary and adds an additional deprecation that is cheap to delay.

#3327 from yl565/manova_squashed is a new feature for 0.9, too new to add between rc and release

most of the other ones outside tsa seem ok to backport (but I have not checked all of them). I'm not sure what should be backported in tsa.

TomAugspurger commented 7 years ago

I think at this stage we should only backport the missing compatibility fixes and bugfixes. I don't think we should start with adding refactorings to the 0.8 branch.

Makes sense. If this goes relatively smoothly, I can help with a 0.8.1 release in a month or two. I'll update my table now to be more conservative on what is backported (feel free to edit it directly if you want, or copy it to the original post).

TomAugspurger commented 7 years ago

Date-handling-specific refactors:

@ChadFulton can you confirm if #3387 should be backported? Is it independent of #3272? Is there anything else that shouldn't be backported?

josef-pkt commented 7 years ago

new merge for backporting

#3406 71da899ffdafa672d15cb99abd5ae2fdffe3bada (bug in basic usecase of get_prediction)

TomAugspurger commented 7 years ago

@josef-pkt thoughts on #3311? It's not labeled for backport.

Edit: Ah, I see you've mentioned it up above in this thread.

ChadFulton commented 7 years ago

For my PRs, here's how I see them. I will plan to backport all of them this weekend (except for the ss-cykfs and tsa-dates ones, which I think we're saving for 0.9).

Should be backported

Can be backported

Should not be backported

josef-pkt commented 7 years ago

Yes, #3311 fixes a bug introduced in 0.8rc1, and fixes what most likely never worked

TomAugspurger commented 7 years ago

Yes, #3311 fixes a bug introduced in 0.8rc1, and fixes what most likely never worked

Thanks

I will plan to backport all of them this weekend (except for the ss-cykfs and tsa-dates ones, which I think we're saving for 0.9).

OK, I'll remove statespace from https://github.com/statsmodels/statsmodels/pull/3408

And @ChadFulton, just a typo but

3273 (tsa-dates: date refactoring)

should be #3272

ChadFulton commented 7 years ago

I will have a fair amount of time this weekend to work on things. At least, I will:

Anything else?

I am also happy to review other tsa-related issues and try and resolve them. I will look over the "is:open is:issue label:comp-tsa label:type-bug" non-0.9 issues myself, but I sometimes have a hard time interpreting where they stand (i.e. many of these seem partially / fully resolved), so if anyone knows of particular issues they'd like me to make an attempt at, just let me know about the issue number

josef-pkt commented 7 years ago

@ChadFulton If your "Should be backported" are ok, then Tom can cherrypick them in sequence together with the other backports.

I would leave the "Can be backported" for 0.9, mainly to avoid getting additional cython code through debian testing. When I checked a few weeks ago, they were close to finishing up with the next release. (Unrelated to tsa, it looks like coordinating pandas, statsmodels and seaborn bugs and problems takes a bit of effort because of the close integration.)

TomAugspurger commented 7 years ago

And @ChadFulton, I'm happy to cherrypick your "Should be backported" as part of #3408. Just let me know.

ChadFulton commented 7 years ago

And @ChadFulton, I'm happy to cherrypick your "Should be backported" as part of #3408. Just let me know.

Yes, thanks!

ChadFulton commented 7 years ago

I would leave the "Can be backported" for 0.9, mainly to avoid getting additional cython code through debian testing.

That sounds fine to me, I'd intended them to be part of 0.9

ChadFulton commented 7 years ago

And @ChadFulton, just a typo but

3273 (tsa-dates: date refactoring)

should be #3272

Thanks!

TomAugspurger commented 7 years ago

I updated https://github.com/statsmodels/statsmodels/issues/3075#issuecomment-276572146

Going through now, but we should have every merge commit accounted for (either included in #3408 or not being backported). I'm going to verify this.

Going forward, how are we tracking what needs to be backported? We have

currently open and labeled for backport. All of them look close to being done, but none are release blockers I think.

It'd be nice to track PRs merged to master, but not yet backported somewhere (either a label or a task list at the top of this post). The backport label doesn't quite do this, since it has PRs that have been backported as well.
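
One way to get such a task list is to generate it from two sets of PR numbers. The sketch below is only an illustration of the idea: the function name is hypothetical, the two inputs would have to come from somewhere (for example the merge log and the backport PR descriptions), and the output uses GitHub's checkbox syntax.

```python
def backport_task_list(merged_to_master, backported):
    """Render a GitHub task list: checked if backported, unchecked if not."""
    done = set(backported)
    lines = []
    for pr in sorted(set(merged_to_master)):
        box = "x" if pr in done else " "
        lines.append(f"- [{box}] #{pr}")
    return "\n".join(lines)


if __name__ == "__main__":
    # PR numbers taken from this thread purely as sample input.
    print(backport_task_list([3405, 3348, 3217], [3348]))
```

Pasting the output at the top of the issue would show at a glance which merged PRs are still waiting for a cherry-pick.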

ChadFulton commented 7 years ago

I'll finish, merge, and backport #3405 by this weekend.

josef-pkt commented 7 years ago

I think both #3348 and #3217 should be backported. I didn't try to understand the details for #3348, but it should future-proof statsmodels a bit (future warnings from pandas and numpy; otherwise, we might need a 0.8.1 for some compatibility fixes). I merged it.

#3217 fixes ARIMA pickling which currently doesn't work (merging cannot break what doesn't work)

needs a rebase before merging (too far behind master)

bashtage commented 7 years ago

Is there a definitive list of things to be fixed so that a release can (finally) be made? 0.6.1 is really long in the tooth.

josef-pkt commented 7 years ago

I'm looking at prio-high and backport labels. Some of those are wishful thinking and can be delayed.

#3409 is a new blocker that needs better pandas index handling (I guess reindex is wrong)

#3344 / #3346 is old behavior, but a type-bug-wrong (meaning silently returns wrong numbers, ignores keyword)

#3405 Chad said we will finish this

#3332 can be closed if there are no more numpy and pandas warnings left

#3118 is just a doc fix

#3182 is a regression compared to 0.6 in not very common usage, but refactoring predict seems to be difficult and created problems in the pandas - patsy - statsmodels interaction.

and I removed one backport label

josef-pkt commented 7 years ago

#3355 just adding a term back in, not so important but a regression compared to 0.6

#3190 is another type-bug-wrong, but in a corner case that could be ignored for 0.8 (I didn't look at details for a fix yet)

all (or maybe almost all) other 22 prio-high issues are not urgent. the label is mostly a reminder that we shouldn't wait forever to resolve them.

TomAugspurger commented 7 years ago

So now that #3408 is in, where do we stand?

  1. Merge to master, then backport
  2. Merge directly to 0.8.x

Any reason not to cut a release today or tomorrow, assuming those get in? I'm able to do the final doc build and the release on conda-forge.

For PyPI, who all is a member in https://github.com/MacPython/statsmodels-wheels? We'll need to update the version of multibuild there to get wheels for 3.6. @matthew-brett are you still following this thread?

ChadFulton commented 7 years ago

I notice that none of our CI builds are testing against numpy 1.12.0, which seems to produce a number of new warning messages, at least in statespace. These just seem to be related to somewhat sloppy unit test writing. So two questions:

  1. Should I fix those warnings prior to the release? (Maybe just silencing the warnings, there's nothing wrong with the tests themselves)
  2. Should we make sure to do some full CI runs with numpy 1.12.0 prior to release?
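
The two options in question 1 (silencing a warning versus treating it as a failure) can both be expressed with the standard library's warnings module. This is a generic sketch, not the statespace test code: `noisy_sum` is a hypothetical stand-in for a call that triggers a library warning, and the helper names are made up for illustration.

```python
import warnings


def noisy_sum(values):
    # Hypothetical stand-in for code that triggers a deprecation warning.
    warnings.warn("stand-in deprecation", DeprecationWarning)
    return sum(values)


def call_quietly(func, *args, category=Warning, **kwargs):
    """Run func with warnings of the given category silenced."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category)
        return func(*args, **kwargs)


def call_strictly(func, *args, **kwargs):
    """Run func with every warning turned into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return func(*args, **kwargs)
```

The strict variant is the more useful one in CI, since a new numpy release that adds warnings then shows up as test failures instead of log noise; the quiet variant only hides the messages.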

josef-pkt commented 7 years ago

I thought we use 1.12 now, but python 3.5 uses numpy 1.10. The way it is set up, the last two python versions, 2.7 and 3.5, on travis use the latest release available through conda. (I thought I had seen 1.12 in the past, but that doesn't look right; spot checking some older Travis runs, it is the same 1.10 for python 3.5 and 1.11 for python 2.7.) So the question is why travis or conda doesn't use numpy 1.12.

Debian testing for next release had used numpy 1.12rc or dev.

statsmodels 0.8 should run properly/clean with numpy 1.12, also so that Debian can update their version without additional problems, if they do.

bashtage commented 7 years ago

Travis doesn't use 1.12 since anaconda hasn't released a numpy 1.12 package.

I've tested against 1.12. Nothing big enough to warrant holding off. Really need to update Travis. Many of the setups are so old as to be useless.

On Sat, Feb 4, 2017, 16:06 Josef Perktold notifications@github.com wrote:

I thought we use 1.12 now, but python 3.5 uses numpy 1.10. The way it is set up, the last two python versions, 2.7 and 3.5, on travis use the latest release available through conda. (I thought I had seen 1.12 in the past, but that doesn't look right; spot checking some older Travis runs, it is the same 1.10 for python 3.5 and 1.11 for python 2.7.) So the question is why travis or conda doesn't use numpy 1.12.

Debian testing for next release had used numpy 1.12rc or dev.

statsmodels 0.8 should run properly/clean with numpy 1.12, also so that Debian can update their version without additional problems, if they do.


ChadFulton commented 7 years ago

My two issues (#3405 + backport #3422 and #3421) have been fixed and merged.

bashtage commented 7 years ago

#3424 contains a travis build that uses NumPy 1.12

josef-pkt commented 7 years ago

@ChadFulton Can you check whether any of your backports (since 0.8rc1) need to be added to the release\version0.8.rst or whether they are all covered with the general description for adding statespace models?

I'm going through the other backports, and expect to add a few items to the list.

The contributor list based on the git log will also need to be updated to take account of new contributors since 0.8rc1

AFAICS, we have 3 more backports #3425 #3426 and #3429
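
Updating the contributor list from the git log could look roughly like this. It is a sketch under assumptions: it expects the text output of `git shortlog -s -n <range>` (one "count, tab, name" line per author), the function names are hypothetical, and the revision range to use (for example from the rc1 tag to the release branch) is not specified in this thread.

```python
def parse_shortlog(text):
    """Map author name -> commit count from `git shortlog -s -n` output."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line is a right-aligned count, a tab, then the author name.
        count, name = line.strip().split("\t", 1)
        counts[name] = int(count)
    return counts


def new_contributors(before, since):
    """Names that appear in `since` but not in `before`, sorted."""
    return sorted(set(since) - set(before))
```

Running `parse_shortlog` once on the history up to rc1 and once on the commits since rc1, then passing both results to `new_contributors`, gives the names that would need to be added to the release notes.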

ChadFulton commented 7 years ago

@ChadFulton Can you check whether any of your backports (since 0.8rc1) need to be added to the release\version0.8.rst or whether they are all covered with the general description for adding statespace models?

These were all bug fixes and are covered by the general description.

ChadFulton commented 7 years ago

I'm all done with 0.8 as far as I know.

josef-pkt commented 7 years ago

updated list to backport: 3425, 3426, 3429, 3432, 3435

3429, 3351 not yet in master

3441 backport with correction?

3351 (installation instructions)

That's the end AFAICS.

TomAugspurger commented 7 years ago

https://github.com/MacPython/statsmodels-wheels/pull/1 should have all the kinks worked out for linux and mac wheels. I'll just push an additional commit once the release is tagged.

josef-pkt commented 7 years ago

@TomAugspurger very good. Are there any more patches for macpython that need to go into maintenance?

An sdist worked also for me without problems with pip install on windows.

There might be some doc build issue remaining.

TomAugspurger commented 7 years ago

Are there any more patches for macpython that need to go into maintenance?

I skipped the tests failing due to https://github.com/statsmodels/statsmodels/pull/3402 (pickling statespace models on python 3.6). Not worth holding up the release IMO.

bashtage commented 7 years ago

No, and not able to merge I think without lots of surgery. Pickling has always been broken for those classes -- only old Python didn't tell you and just returned the class name with no state.

TomAugspurger commented 7 years ago

Unless I'm missing anything, all the outstanding issues have been resolved. And we can make additional documentation fixes after tagging the release.

bashtage commented 7 years ago

The two backports are left, then I think it is ready, assuming the OSX build is working

bashtage commented 7 years ago

Message is like

internal padding after c:\git\statsmodels\docs\source\gettingstarted.rst:11: WARNING: image file not readable: ..\build\build\html_static\gettingstarted_0.png

And gettingstarted_0.png is in docs\source.

Easy to manually move, but not sure why these are output in the wrong place.

TomAugspurger commented 7 years ago

Docs with the current maintenance/0.8.x are at http://www.statsmodels.org/dev/

josef-pkt commented 7 years ago

@bashtage @TomAugspurger Based on what you did yesterday, #3345 is the last missing piece, is it?

I can backport it and then tag and push to pypi. I haven't tried the microsoft compiler build yet, but that could wait a day to get on pypi.

TomAugspurger commented 7 years ago

Yep, I believe so. Once it's on Github I can do the Mac and Linux wheels, and once it's on PyPI I can do the conda-forge build.

bashtage commented 7 years ago

I have found quite a few more doc issues, but I don't think these should hold things up.

bashtage commented 7 years ago

One remaining issue -- there are promised deprecations that haven't been removed, e.g. jac method.