ourownstory / neural_prophet

NeuralProphet: A simple forecasting package
https://neuralprophet.com
MIT License

Decide on an API philosophy for V2.0 #985

Open ourownstory opened 1 year ago

ourownstory commented 1 year ago

Currently, we follow the classic setup of one long list of args to initialize the model. On top of that, we have some functions to configure special modules after init, like add_lagged_regressor.

e.g.

from neuralprophet import NeuralProphet
m = NeuralProphet(
    learning_rate = 0.1,
    n_changepoints = 20,
    ...
)
m = m.add_lagged_regressor("temp", ...)

However, the list of args to the model has become long, and having some functions set up modules post-init is inconsistent. Thus, I suggest we rethink this.

Here are two and a half ideas for how to change this:

1 Structured with configuration classes

This makes the setup less complicated but more complex, as it requires knowing about additional configuration classes.


from neuralprophet import NeuralProphet
from neuralprophet.configure import *

m = NeuralProphet(
    train = Train(
        learning_rate = 0.1,
    ),
    trend = Trend(
        n_changepoints = 20,
    ),
    lagged_regressor = LaggedRegressor(
        variables = [
            LaggedRegressorVariable(name="temp"),
        ],
        regularization = 0.01,
    ),
    ...
)

To make it backwards compatible, we could have a helper function that maps old arguments to the new format. It would be challenging to include the post-init configuration methods in such a compatibility layer.

from neuralprophet import NeuralProphet, convert_config

m = NeuralProphet(
    **convert_config(
        learning_rate = 0.1,
    )
)
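
A minimal sketch of what such a helper might look like; the config classes and the arg-to-section mapping below are illustrative assumptions, not the actual implementation.

from dataclasses import dataclass

@dataclass
class Train:
    learning_rate: float = 0.001

@dataclass
class Trend:
    n_changepoints: int = 10

# map each legacy flat arg to (constructor kwarg, config class, field name)
LEGACY_ARG_MAP = {
    "learning_rate": ("train", Train, "learning_rate"),
    "n_changepoints": ("trend", Trend, "n_changepoints"),
}

def convert_config(**legacy_kwargs):
    """Group legacy flat args into structured config objects."""
    grouped = {}
    for name, value in legacy_kwargs.items():
        section, config_class, field_name = LEGACY_ARG_MAP[name]
        grouped.setdefault(section, config_class())
        setattr(grouped[section], field_name, value)
    return grouped

# convert_config(learning_rate=0.1) -> {"train": Train(learning_rate=0.1)}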

2 Grouped flat arguments

While we use approach 1 to organize configs internally, we expose a flat user-facing list of args. We simply add a prefix to each arg according to the module it belongs to, e.g. train_learning_rate. This is less complex, but more complicated and verbose. It will further increase the length of the class's arg list, but it will help group and find relevant args.

m = NeuralProphet(
    train_learning_rate = 0.1,
    trend_n_changepoints = 20,
    lagged_regressor_names = ["temp", ],
    lagged_regressor_regularization = 0.01,
    ...
)
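
Internally, the prefixed args could be split back into per-module groups; here is an illustrative sketch (not the actual internals) of how that grouping might work.

def group_flat_args(**kwargs):
    """Split prefixed flat kwargs into per-module dicts."""
    prefixes = ("lagged_regressor", "train", "trend")
    grouped = {prefix: {} for prefix in prefixes}
    for name, value in kwargs.items():
        for prefix in prefixes:
            if name.startswith(prefix + "_"):
                grouped[prefix][name[len(prefix) + 1:]] = value
                break
    return grouped

# group_flat_args(train_learning_rate=0.1, trend_n_changepoints=20)
# -> {"lagged_regressor": {}, "train": {"learning_rate": 0.1},
#    "trend": {"n_changepoints": 20}}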

3 Hybrid of structured and flat arguments

We could support both flat args and structured config objects being passed to the class. This allows for flexibility on the user side and a less bumpy transition to the structured format, but it is the most complex of all options. Hereby, the flat args could be converted to the structured configs internally.

This would allow for:


from neuralprophet import NeuralProphet
from neuralprophet.configure import *

m = NeuralProphet(
    train_learning_rate = 0.1,
    trend_n_changepoints = 20,
    lagged_regressor = LaggedRegressor(
        variables = [
            LaggedRegressorVariable(name="temp"),
        ],
        regularization = 0.01,
    ),
    ...
)

and

m = NeuralProphet(
    train = Train(
        learning_rate = 0.1,
    ),
    trend = Trend(
        n_changepoints = 20,
    ),
    lagged_regressor = LaggedRegressor(
        variables = [
            LaggedRegressorVariable(name="temp"),
        ],
        regularization = 0.01,
    ),
    ...
)

and without a complex configuration, it could still look this clean:

m = NeuralProphet(
    train_learning_rate = 0.1,
    trend_n_changepoints = 20,
    ...
)
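
For illustration, a hedged sketch of how a hybrid constructor might normalize both input styles onto one internal config; the class and arg names are assumptions.

from dataclasses import dataclass

@dataclass
class Train:
    learning_rate: float = 0.001

class NeuralProphet:
    def __init__(self, train=None, **flat_kwargs):
        # either take a ready-made config object ...
        self.config_train = train if train is not None else Train()
        # ... or map flat "train_"-prefixed args onto the same config
        for name, value in flat_kwargs.items():
            if name.startswith("train_"):
                setattr(self.config_train, name[len("train_"):], value)

# both calls end up with config_train.learning_rate == 0.1
m1 = NeuralProphet(train=Train(learning_rate=0.1))
m2 = NeuralProphet(train_learning_rate=0.1)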

Thoughts

I would suggest considering Option 1 or 3c, as it allows us to slowly deprecate the old interface and move users to the new one, while still allowing for a beginner-friendly basic model configuration via flat args, as well as allowing power users to configure the model in exactly the way the real configuration works under the hood.

What are your thoughts?

noxan commented 1 year ago

Thanks for structuring the earlier brainstorming so well @ourownstory 👏

Feedback on (2)

Small point of feedback on the lagged regressors: it is currently possible to define regularization for each lagged regressor individually, which could make (2) a bit clunky (unless I misunderstood the syntax), e.g.

lagged_regressor_names = ["temp1", "temp2", "temp3"],
lagged_regressor_regularization = [0.01, 0.2, 0.005],
lagged_regressor_n_lags = [3, None, "scalar"],
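
For contrast, the structured form from option (1) would keep each regressor's settings together instead of spreading them across parallel lists; this sketch assumes the proposed classes gain per-variable n_lags and regularization fields.

lagged_regressor = LaggedRegressor(
    variables = [
        LaggedRegressorVariable(name="temp1", n_lags=3, regularization=0.01),
        LaggedRegressorVariable(name="temp2", regularization=0.2),
        LaggedRegressorVariable(name="temp3", n_lags="scalar", regularization=0.005),
    ],
)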

My humble opinion on overall design

I'm a strong advocate for a single way to do things: strong defaults and an opinionated approach ensure people have one way of doing things and get things done, instead of wondering which way might be best (especially if the choices are very similar). It also makes it easy to search for problems and solutions online, since they are all compatible and consistent.

Another idea: (4) Configuration attributes (and methods)

We could move the configuration from the constructor to attributes or methods. It's not really how most machine learning projects work, so this is more to include the idea for completeness and discussion:

Variant 1 (4.1) - Nested attributes

m = NeuralProphet()
m.training.learning_rate = 0.02
m.trend.n_changepoints = 5
m.add_lagged_regressor("temp1", regularization=0.01)

Variant 2 (4.2) - Flat attributes

m = NeuralProphet()
m.learning_rate = 0.02
m.n_changepoints = 5
m.add_lagged_regressor("temp1", regularization=0.01)
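
A minimal sketch of plain config objects that could back the nested attributes of (4.1); the names here are illustrative, not the NeuralProphet internals.

from dataclasses import dataclass

@dataclass
class TrainingConfig:
    learning_rate: float = 0.001

@dataclass
class TrendConfig:
    n_changepoints: int = 10

class NeuralProphet:
    def __init__(self):
        self.training = TrainingConfig()
        self.trend = TrendConfig()

m = NeuralProphet()
m.training.learning_rate = 0.02
m.trend.n_changepoints = 5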

karl-richter commented 1 year ago

@ourownstory @noxan Interesting approaches, I really like the idea of rethinking the UI!

For 2/3c, I like the flat args for model components, but for the training args (e.g. batch_size/epochs) I feel the prefix is more confusing than helpful (in the ML world these args are so standard that I would consider it bad practice to deviate from them). One idea (where I am not sure whether it's good or bad) could be to move the training args to the .fit() function, since that's technically the only place where they are needed.
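
A sketch of that proposed shape (not the current API; df is assumed to be a prepared DataFrame):

m = NeuralProphet(
    trend_n_changepoints = 20,
)
# training args live where they are used, as in most ML libraries
metrics = m.fit(df, freq="D", epochs=100, batch_size=64, learning_rate=0.01)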

I think that long term, version 1/3 would be the most user-friendly and would allow us to outsource the logic of each model component into a separate object. I feel like that would be the perfect UI for when we transition to a more modular NeuralProphet. We could even consider slightly abstracting the model building from the forecaster definition. I adjusted your example to adhere to what I brainstormed during my research; it shares some ideas with how PyTorch handles modularity.

model = SequentialModel(
    Trend(
        n_changepoints = 20,
    ),
    LaggedRegressor(
        variables = [
            LaggedRegressorVariable(name="temp"),
        ],
        regularization = 0.01,
    )
)
m = NeuralProphet(model=model)

To transition to an architecture similar to 1/3, I would suggest combining this with the research I am currently conducting on modularity, since that will bring the need for similar changes to the UI anyway. To clean it up short term, I would vote for version 2, but without the train_ prefix.

LeonieFreisinger commented 1 year ago

@ourownstory @noxan @karl-richter I agree, it's a great idea to restructure the hyperparameter UI. I like the feedback you all brought in.

Let me quickly summarize and touch upon the feedback:

  • flexibility vs. one clear approach - I think Richard makes a valid point that one clear approach facilitates problem/solution search for users.
  • only flat args (opt2) - Richard brought up the point that the args might get clunky the more they are individualized (e.g. with individual lagged regressor regularization), and Karl mentioned the common practice of having the training args without a prefix. Besides those two points, (flat) args are definitely a lean way to go. From this, I think it can be concluded that a hybrid approach would be good.
  • train args in fit() - I think it's worth discussing. However, it slightly blows up the input args to fit(). From what I have seen at darts and gluonts, they both pass the train args to the model(), probably to keep fit() lean.
  • hybrid approach (opt3) - I think a good compromise would be a sub-option of option 3, where the train args are used without a prefix as single args and a configuration class is used for every other arg (see the sketch below). (To pick up bullet point 1: this would not allow the user to flexibly choose between two input formats per arg.) We would still need to discuss how to handle "stand-alone" args where it's not worth creating a configuration class. Will it be confusing for the user to also input them as single args without a prefix?
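
A sketch of this sub-option, reusing the config class names proposed above (the exact train args shown are assumptions):

m = NeuralProphet(
    learning_rate = 0.1,  # train arg, no prefix
    epochs = 100,         # train arg, no prefix
    trend = Trend(
        n_changepoints = 20,
    ),
    lagged_regressor = LaggedRegressor(
        variables = [
            LaggedRegressorVariable(name="temp"),
        ],
    ),
)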

As a reference:

karl-richter commented 1 year ago

Discussion thoughts

# variant a: plain constructor, rely on defaults
m = NeuralProphet()

# variant b: add components to the model post-init
m = NeuralProphet()
m.add(
    LaggedRegressor()
)

# variant c: pass components to the constructor, add more post-init
m = NeuralProphet(components=[
    Trend(),
    Seasonality(),
])
m.add(
    LaggedRegressor()
)

All functions

m = NeuralProphet()
m.add_trend(n_changepoints=5)