Ownership: The prompt and dataset are from stratascratch.com; the solution notebook and feature engineering scripts are my own work.
When a consumer places an order on DoorDash, we show the expected time of delivery. It is very important for DoorDash to get this right, as it has a big impact on consumer experience. In this exercise, you will build a model to predict the estimated time taken for a delivery.
Concretely, for a given delivery you must predict the total delivery duration in seconds, i.e., the time taken from:
Start: the time consumer submits the order (created_at)
to
End: when the order will be delivered to the consumer (actual_delivery_time)
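As a rough illustration of the target definition, the duration can be computed by subtracting the two timestamps. This is a minimal sketch: the timestamp format string, the function name, and the example values are assumptions for illustration, not taken from the dataset.

```python
from datetime import datetime


def delivery_duration_seconds(created_at, actual_delivery_time,
                              fmt="%Y-%m-%d %H:%M:%S"):
    """Return seconds elapsed between order creation and delivery.

    The format string is an assumption; adjust it to match the raw data.
    """
    start = datetime.strptime(created_at, fmt)
    end = datetime.strptime(actual_delivery_time, fmt)
    return (end - start).total_seconds()


# Hypothetical order: created at 22:24:17, delivered at 23:27:16 the same day.
example_duration = delivery_duration_seconds(
    "2015-02-06 22:24:17", "2015-02-06 23:27:16"
)
```

A negative result would indicate that the recorded delivery time precedes the order creation time, which is worth checking for before modelling.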
In addition to the system-derived data, there are two values produced by other ML models for each order:
estimated_store_to_consumer_driving_duration
estimated_order_place_duration
The best model I have built so far uses a two-step ensemble approach:
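The general shape of a two-step setup can be sketched as follows. This is not the notebook's actual pipeline: it uses a hand-written one-variable least-squares fit in place of a real model, hypothetical toy orders, and the assumption that in-store prep time equals total duration minus the estimated non-prep time. Step one predicts prep time; step two predicts total duration from that prediction combined with the non-prep estimate.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


# Hypothetical toy orders:
# (total_items, est_non_prep_seconds, total_duration_seconds)
orders = [
    (1, 900, 2100),
    (2, 1000, 2500),
    (3, 800, 2600),
    (4, 1200, 3300),
    (5, 1100, 3500),
]
items = [o[0] for o in orders]
est_non_prep = [o[1] for o in orders]
total = [o[2] for o in orders]

# Assumption: prep time is what remains after removing the non-prep estimate.
prep = [t - e for t, e in zip(total, est_non_prep)]

# Step 1: model prep time (here from item count only).
a1, b1 = fit_line(items, prep)
pred_prep = [a1 + b1 * x for x in items]

# Step 2: model total duration from predicted prep time plus non-prep estimate.
stage2_x = [p + e for p, e in zip(pred_prep, est_non_prep)]
a2, b2 = fit_line(stage2_x, total)
pred_total = [a2 + b2 * x for x in stage2_x]
```

In practice the step-one predictions fed into step two would normally be made out-of-fold, so that the second model does not train on predictions the first model produced for rows it had already seen.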
Using this two-step ensemble approach, the best scores I have produced so far are as follows:
Although DoorDash uses RMSE to score this exercise, the MAE and RMSE-to-y_true-standard-deviation ratio provide more context:
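The three metrics relate to each other as in the sketch below. It is a minimal illustration rather than the scoring code used in the notebook; the function names, the toy values, and the use of the population standard deviation are assumptions.

```python
import math
from statistics import pstdev


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def rmse_to_std_ratio(y_true, y_pred):
    """RMSE divided by the population standard deviation of y_true."""
    return rmse(y_true, y_pred) / pstdev(y_true)


# Hypothetical durations in seconds.
y_true = [1000, 2000, 3000, 4000]
y_pred = [1100, 1900, 3200, 3800]

mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
ratio = rmse_to_std_ratio(y_true, y_pred)
```

Because always predicting the mean of y_true gives an RMSE equal to its population standard deviation, a ratio below 1 means the model does better than that constant baseline.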
To provide a benchmark for performance I found two other notebooks that work through this prompt and dataset:
created_at and/or actual_delivery_time values
created_at was later than/less than the actual_delivery_time value, or vice-versa
Summary of feature engineering:
actual_total_delivery_duration) and deriving from that the amount of time an order spends in-store
Top ten features by importance for the two-step (prep. time pred >>> delivery time prediction) model:
Feature | Score |
---|---|
pred_order_prep_time | 0.314096 |
est_time_non-prep | 0.050941 |
estimated_store_to_consumer_driving_duration | 0.022149 |
market_id__4.0 | 0.012894 |
onshift_to_outstanding | 0.012503 |
clean_store_primary_category__dessert | 0.011970 |
total_items | 0.010003 |
hour_mean_total_onshift_dashers | 0.009541 |
estimated_order_place_duration | 0.009181 |
clean_store_primary_category__american | 0.008342 |
For comparison, these are the top ten features for a single model approach:
Feature | Score |
---|---|
hour_mean_total_outstanding_orders | 0.242148 |
est_time_non-prep | 0.116266 |
onshift_to_outstanding | 0.070525 |
hour_busy_outs_avg | 0.032786 |
hour_mean_total_onshift_dashers | 0.049111 |
market_day_mean_total_outstanding_orders | 0.032590 |
store_day_of_week_est_time_prep_per_item_mean | 0.022351 |
busy_to_outstanding | 0.020434 |
orders_without_dashers | 0.019857 |
created_day_mean_total_outstanding_orders | 0.016961 |
Please note that this section of the project needs more attention and development. Two popular dimensionality reduction methods were considered for this project:
Recall that dimensionality reduction has two general benefits: it can improve model accuracy and it reduces compute cost. In this project, the 'reduced' models were less accurate than the 'unreduced' model; of the two reduced variants, the 'top features' approach beat PCA.
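A 'top features' reduction can be sketched as below, assuming it means keeping only the k features with the highest importance scores. The helper names and the example row are hypothetical; the importance values are copied from the two-step table in this README.

```python
def top_k_features(importances, k):
    """Return the names of the k features with the highest scores."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def reduce_row(row, keep):
    """Keep only the selected features from a single row (a dict)."""
    return {name: row[name] for name in keep}


# Importance scores taken from the two-step model table.
importances = {
    "total_items": 0.010003,
    "pred_order_prep_time": 0.314096,
    "market_id__4.0": 0.012894,
    "est_time_non-prep": 0.050941,
    "estimated_store_to_consumer_driving_duration": 0.022149,
}

selected = top_k_features(importances, 3)

# Hypothetical feature row for one order.
row = {
    "total_items": 4,
    "pred_order_prep_time": 1500.0,
    "market_id__4.0": 1,
    "est_time_non-prep": 1200.0,
    "estimated_store_to_consumer_driving_duration": 750.0,
}
reduced = reduce_row(row, selected)
```

Unlike PCA, this kind of selection keeps the original, interpretable columns, but it discards whatever signal the dropped features carried.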