sktime / sktime-dl

DEPRECATED, now in sktime - companion package for deep learning based on TensorFlow
BSD 3-Clause "New" or "Revised" License
595 stars 79 forks source link

April sitrep #47

Closed james-large closed 9 months ago

james-large commented 4 years ago

Hello all, hope everyone's happy and healthy during these times. In general, I think we're all aware that world and life events put projects like this on the backburner. This is just a faux issue to get back up to speed a bit on the project

The general short/mid term goal was (and still is) to get a master push out and pypi package updated with all the basic, solid, and compatible functionality we've added/are adding. That way, the project serves as a base of good core functionality that we can just implement new algs/networks into, and update as needed.

Very broadly, things we still want in:

Anything major I've missed you think?

mloning commented 4 years ago

Hi all, regarding the refactoring of base sktime, I hope to be done with that at the end of May at the latest, it will lead to a new release (v0.4.0), here's a short summary of what it includes:

PR: https://github.com/alan-turing-institute/sktime/pull/246

james-large commented 4 years ago

Nice, sounds good. We could aim to restore parity and be ready for a master push somewhere around the end of May or first weeks of June then. If we get a massive kick in the arse and get a lot done in the next week or two, we can split it into two parts, 0.2.0 = the big functionality update (0.1.0 is distant memory compared to current dev), 0.2.1 = restoring parity to next version of base sktime

Withington commented 4 years ago

Great, I’m glad we are back on it.

Ease of installation: Docker – agreed, the TensorFlow Docker image only supports GPU on Linux (as per their comment here https://www.tensorflow.org/install/gpu). I’ve updated the README with an instruction on how to run our Dockerfile in CPU only mode. #35

Functionality: Networks #46 adds MCDNN regression, leaving only MCC and TWIESN without regression. My vote would be to leave these on the todo list but go ahead with release 0.2.0 without them.

james-large commented 4 years ago

1) Yeah the Tensorflow-provided setups are for linux only and work fine, there were people around saying you can get it working on windows too though. Provided a nice distraction for a while until it turned into annoyance.

2) Agreed leave on the todo in my opinion. On a pragmatic level, they are not the strongest networks on average, from the point of view of a user just wanting to benchmark the strongest approaches on their data, MCNN and TWIESN liekly aren't the ones to be using anyway

mloning commented 4 years ago

So, there are two remaining PRs now because we can go for a new release.

Refactoring in sktime has also moved ahead and should be ready for a new release at the end of the month. Shall we make a separate release with sktime-dl before that or check integration with sktime's new release changes first?

Withington commented 4 years ago

We should probably wait for sktime release. Shall we create a release candidate branch soon, to avoid adding too much more to the release? Proposed set of branches:

Withington commented 4 years ago

Are there any issues that need to be addressed for v0.2.0? I have some time to work on it for the next few days.

mloning commented 4 years ago

I think we just need to wait for sktime now, but if you like, you could work on some of the easier open issues (e.g. consistent handling of random states #53 or investigating the sporadically failing forecasting tests #56)

mloning commented 4 years ago

So finally, sktime v0.4.0 is out. I have a few other things to look into first, but we should be able to make any final changes to ensure we're compatible and then publish a new release for sktime-dl.

Withington commented 4 years ago

good going @mloning ! 🥇

mloning commented 4 years ago

FYI: https://mcfly.readthedocs.io/en/latest/ deep learning for time series classification with automated hyper-parameter selection

Withington commented 4 years ago

Hi team, I hope you are all well. I'm keen to get dev promoted to master, especially since jmrichardson is starting to use dev. What needs to be done? Is it just a case of switching dev to use sktime v0.4.1 and testing? If so, I can do that.

mloning commented 4 years ago

Hi @Withington, I agree! Ensuring everything still works with sktime 0.4.1 will be the main thing. We should also be able to replace the forecasting wrapper with sktime's forecasting wrappers in sktime.forecasting.compose

james-large commented 4 years ago

Hi @mloning @Withington

Surviving to some extent, pardon the deja vu on this issue. Hope you're both well also

I've tidied one of the branches I started a while back and made the PR for that, as I type it looks like the linting failed, will fix in a sec.

@ABostrom did a PR (#59) that hopefully caught most of the changes for base sktime. If we trigger another run of the tests for sktime-dl dev and they pass, are we generally happy to take that as full compatibility? I believe we may want/need to update the minimum version to 0.4.0 or 0.4.1, it is currently >=0.3.0 (https://github.com/sktime/sktime-dl/blob/dev/setup.py#L53). Don't think 0.3.0 is compatible with -dl any more.

On versioning, tensorflow 2.3.0 came out a week or two ago. It seems to have not brought in any new conflicts and be used with -dl as is, miraculously. We could include another test env for this potentially, or change the 2.1 test to 2.3. 2.1 is where the changes to tensorflow-addons happened, so there may be merit in explicitly having 2.1 (https://github.com/sktime/sktime-dl/blob/dev/.travis.yml#L27).

In terms of additional functionality for 0.2.0, I'd be very happy to shelve too much more and prioritise the dev->master push. 0.2.0 is a completely different beast to 0.1.0 already, waiting further and adding an extra <5% difference seems unnecessary. There's always 0.2.1 or 0.3.

What should be done though, as @mloning said, is replacing the forecasting via regression tests.

I have intentions of implementing a wrapper for passing data in as a generator to get around a gpu memory issue I'm having with some data. This is something that would make using a lot of networks easier on non-tiny data for the user dipping their toes into dl and gpus without expensive hardware. I'd be happy leaving it out if we want to do a push asap (today/tomorrow).

Withington commented 4 years ago

Great to see that sktime-dl is moving on :-)

On "are we generally happy to take that as full compatibility" - I'll also run the notebook examples when I get on to this tomorrow.

Also, the forecasting test was removed in #59 so I'd like to understand where we are with that (see question raised in that PR). Is that the same as @mloning point on "replace the forecasting wrapper with sktime's forecasting wrappers"? Is anyone already looking at that?

It's worth keeping a test env for tensorflow 2.1 while users are likely to still be using it.

On your data generator work - I'm happy to leave that out of this release, in the interest of getting sktime-dl 0.2.0 past the finishing post.

ABostrom commented 4 years ago

Yeah. Deliberately left the forecaster tests out of that PR, as I couldn't find the analogous functionality from sktime anymore. But @mloning may know more etc.

fkiraly commented 9 months ago

obsolete planning issue from 2020