ramanshahdatascience / tshirts

The Bayesian t-shirts: a taste of optimal inventory
BSD 3-Clause "New" or "Revised" License

Hierarchical model #5

Open ramanshah opened 2 years ago

ramanshah commented 2 years ago

For the third box of t-shirts, it's becoming clear that the Dirichlet prior isn't really adequate to come up with the best orders. From a state like this:

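| | MXS | MS | MM | ML | MXL | M2XL | M3XL | WS | WM | WL | WXL | W2XL | totals |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lifetime received | 1 | 4 | 19 | 13 | 4 | 3 | 1 | 3 | 9 | 4 | 6 | 3 | 70 |
| Lifetime queued | 0 | 1 | 19 | 12 | 2 | 0 | 0 | 3 | 8 | 1 | 2 | 1 | 49 |
| Lifetime shipped | 0 | 1 | 18 | 11 | 2 | 0 | 0 | 3 | 8 | 1 | 2 | 1 | 47 |
| Actual inventory (received - shipped) | 1 | 3 | 1 | 2 | 2 | 3 | 1 | 0 | 1 | 3 | 4 | 2 | 23 |
| Inventory once caught up (received - queued) | 1 | 3 | 0 | 1 | 2 | 3 | 1 | 0 | 1 | 3 | 4 | 2 | 21 |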

We're getting an order like this:

Optimal order:
MXS : 0
MS  : 0
MM  : 13
ML  : 8
MXL : 2
M2XL: 0
M3XL: 0
WS  : 3
WM  : 6
WL  : 1
WXL : 1
W2XL: 1

Note how the Dirichlet prior results in requests to buy more women's L, XL, and 2XL shirts even though, in all three cases, I already have at least 2x as many of that size as have ever been requested. This looks like clear waste: the model can't use the knowledge that women are generally under-represented among orders to reduce the expected number of orders for any given women's size.
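To make the arithmetic behind that observation concrete, here is a small sketch that recomputes the stock-to-demand ratio for the three women's sizes and the women's share of queued orders from the table above. The symmetric `ALPHA = 1` prior is an assumption for illustration only; the prior actually used in the repository may differ.

```python
# Counts copied from the inventory table above.
sizes = ["MXS", "MS", "MM", "ML", "MXL", "M2XL", "M3XL",
         "WS", "WM", "WL", "WXL", "W2XL"]
received = [1, 4, 19, 13, 4, 3, 1, 3, 9, 4, 6, 3]
queued = [0, 1, 19, 12, 2, 0, 0, 3, 8, 1, 2, 1]

# Inventory once caught up = received - queued.
on_hand = {s: r - q for s, r, q in zip(sizes, received, queued)}
demand = dict(zip(sizes, queued))

# Stock on hand relative to lifetime demand for the sizes in question.
coverage = {s: on_hand[s] / demand[s] for s in ("WL", "WXL", "W2XL")}

# Women's share of all queued orders.
womens_share = sum(q for s, q in demand.items() if s.startswith("W")) / sum(queued)

# ASSUMPTION: symmetric Dirichlet prior with alpha = 1 per size.
# The posterior mean is then the usual closed form (count + alpha) / (N + K * alpha).
ALPHA = 1.0
total = sum(queued) + ALPHA * len(sizes)
posterior_mean = {s: (demand[s] + ALPHA) / total for s in sizes}

print(coverage)                          # {'WL': 3.0, 'WXL': 2.0, 'W2XL': 2.0}
print(round(womens_share, 3))            # 0.306
print(round(posterior_mean["WL"], 4))    # 0.0328
```

Under this assumed flat prior, each size's posterior frequency depends only on its own count and the grand total, so nothing in the update reflects that only about 31% of orders are for women's sizes.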

Instead, we want to estimate a hierarchical structure, pooling knowledge of the gender ratio of orders to improve the posterior frequencies of each gendered size. This would require upgrading from the trivial closed-form Dirichlet math to MCMC methods (likely PyMC3) for the hierarchical model.
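One possible way to write down such a structure, as a sketch only: the symbols $\rho$, $\theta^{M}$, $\theta^{W}$, $a$, $b$, $\alpha^{M}$, and $\alpha^{W}$ below are illustrative assumptions, not names from the repository, and the final model may well look different.

```latex
\begin{aligned}
\rho &\sim \mathrm{Beta}(a, b)
  && \text{share of orders that are women's sizes} \\
\theta^{M} &\sim \mathrm{Dirichlet}(\alpha^{M}), \quad
\theta^{W} \sim \mathrm{Dirichlet}(\alpha^{W})
  && \text{size mix within each gender} \\
p &= \bigl((1-\rho)\,\theta^{M},\; \rho\,\theta^{W}\bigr)
  && \text{frequencies over all 12 sizes} \\
n &\sim \mathrm{Multinomial}(N, p)
  && \text{observed order counts}
\end{aligned}
```

With $a$, $b$, and the $\alpha$ vectors held fixed, this factorization is still conjugate. It is placing hyperpriors on those concentration parameters, so that the data inform how strongly to pool, that removes the closed form and calls for sampling.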