alteryx / evalml

EvalML is an AutoML library written in Python.
https://evalml.alteryx.com
BSD 3-Clause "New" or "Revised" License
757 stars 85 forks

time_budget parameter might not be strict enough for harder problems #3827

Open iXanthos opened 1 year ago

iXanthos commented 1 year ago

Hello,

I have been using EvalML for a while and have now tried running analyses on more complex data with a small time_budget value, to observe how it behaves.

evalml version: 0.59.0

Analysis settings: Default, with varying time_budget (given below)

  1. gisette Dimensions: (7000, 5001) Time elapsed: 23:51 Models Trained: naive LR time_budget: 60 (seconds)

  2. QSAR-TID-11109 Dimensions: (1976, 1025) Time elapsed: 01:22 Models Trained: naive LR RF time_budget: 60 (seconds)

  3. covertype Dimensions: (581012, 55) Time elapsed: 01:19 Models Trained: naive LR time_budget: 60 (seconds)

  4. AP_Colon_Kidney Dimensions: (546, 10936) Time elapsed: 35:45 Models Trained: naive LR time_budget: 120 (seconds)

It is evident in cases 1 and 4 that EvalML created the naive model and then proceeded with the creation of the linear one. The problem is that the time elapsed is much longer than the time_budget. My questions are:

From my understanding, once a model starts training (within the time_budget), it cannot stop. Still, this could pose an issue in tight time_budget scenarios. Moreover, the user might be baffled as to why the analysis is not ending.

Regards, Iordanis Xanthopoulos

jeremyliweishih commented 1 year ago

Hi @iXanthos!

Yes, that is the current expected behavior. AutoMLSearch will finish training the pipeline it is currently on before evaluating whether to stop the search. In your cases, search probably finished the naive model and started the first additional pipeline before max_time had elapsed, and only stopped once the additional pipeline failed. In these cases the dataset is large enough that training and validating the additional pipeline takes longer than max_time.
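To illustrate why the budget can be overshot, here is a minimal sketch of a search loop that checks the elapsed time only between tasks. It is not EvalML's code: `run_search`, the task names, and the sleep durations are hypothetical stand-ins for pipeline training.

```python
import time


def run_search(tasks, time_budget):
    """Run tasks in order, checking the budget only BETWEEN tasks (hypothetical helper)."""
    start = time.monotonic()
    completed = []
    for name, train in tasks:
        # The budget is consulted only before a task starts.
        if time.monotonic() - start >= time_budget:
            break
        train()  # once started, the task always runs to completion
        completed.append(name)
    return completed, time.monotonic() - start


# Sleeps stand in for training time; the second task alone exceeds the budget.
tasks = [
    ("naive", lambda: time.sleep(0.01)),
    ("LR", lambda: time.sleep(0.5)),
    ("RF", lambda: time.sleep(0.01)),
]
completed, elapsed = run_search(tasks, time_budget=0.2)
print(completed, f"elapsed={elapsed:.2f}s")
```

The second task starts while the budget still has time left, so it runs to completion and the total elapsed time ends up well past the budget. The third task is never started.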

To change this behavior, we would need to continuously check the time elapsed and kill the running computation once the time limit is reached. We can put this new behavior behind a flag so that the existing behavior is kept as well.
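One possible way to enforce a hard limit, sketched here with the standard library only, is to run the work in a child process and kill it when the limit is reached. This is an assumption about how such a flag could work, not an existing EvalML feature; `run_with_hard_limit` and `slow_job` are hypothetical names.

```python
import subprocess
import sys
import time


def run_with_hard_limit(code, time_limit):
    """Run `code` in a child Python process and kill it after `time_limit` seconds.

    Hypothetical helper: subprocess.run kills the child and raises
    TimeoutExpired when the timeout is reached.
    """
    start = time.monotonic()
    try:
        subprocess.run([sys.executable, "-c", code], timeout=time_limit, check=True)
        finished = True
    except subprocess.TimeoutExpired:
        finished = False
    return finished, time.monotonic() - start


# A stand-in for a pipeline that takes far longer than the limit allows.
slow_job = "import time; time.sleep(30)"
finished, elapsed = run_with_hard_limit(slow_job, time_limit=1.0)
print(finished, f"elapsed={elapsed:.2f}s")
```

The trade-off is that a killed job returns no partial result, so the caller has to decide whether to fall back to the best pipeline found so far.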

Is this a feature you're interested in working on? If not we can try prioritizing it against our existing issues. Thanks!