Data4DM / BayesSD

Data for Decision, Affordable Analytics for All
8 stars 0 forks source link

Different `estimated parameter` value is used for stock initial value and ode integration #56

Closed hyunjimoon closed 5 months ago

hyunjimoon commented 1 year ago

@tomfid and @Dashadower I need your help! From stan draws2data code below, inventory_adjustment_time = 7 and minimum_order_processing_time = 5 are used to compute the initial values of stocks. These are inputs read from vensim.

    // Initial ODE values
    real inventory__init = 5 + 2 * 100;
    real expected_order_rate__init = 100;
    real work_in_process_inventory__init = 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7);
    real production_rate_stocked__init = 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 6;
    real production_start_rate_stocked__init = fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) + 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) - 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 3;
    real backlog__init = 100 * 2;

However, for ode integration, values generated from user defined prior distribution normal_rng(2, 0.1), normal_rng(0.05, 0.001) is used which is orders of magnitude smaller than 7 and 5.

    real inventory_adjustment_time = normal_rng(2, 0.1);
    real minimum_order_processing_time = normal_rng(0.05, 0.001);
    real m_noise_scale = normal_rng(0.01, 0.0005);

I think this is a serious problem as I've seen different dynamics panning out from minute change of initial values (tipping points? #14). So the questions is, how to sync the two versions of modeler: one initializing parameter values from vensim, the other initializing parameter distribution from stanify. Below is the entire draws2data code.

generated quantities{
    real inventory_adjustment_time = normal_rng(2, 0.1);
    real minimum_order_processing_time = normal_rng(0.05, 0.001);
    real m_noise_scale = normal_rng(0.01, 0.0005);

    // Initial ODE values
    real inventory__init = 5 + 2 * 100;
    real expected_order_rate__init = 100;
    real work_in_process_inventory__init = 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7);
    real production_rate_stocked__init = 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 6;
    real production_start_rate_stocked__init = fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) + 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) - 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 3;
    real backlog__init = 100 * 2;

    vector[6] initial_outcome;  // Initial ODE state vector
    initial_outcome[1] = inventory__init;
    initial_outcome[2] = expected_order_rate__init;
    initial_outcome[3] = work_in_process_inventory__init;
    initial_outcome[4] = production_rate_stocked__init;
    initial_outcome[5] = production_start_rate_stocked__init;
    initial_outcome[6] = backlog__init;

    vector[6] integrated_result[n_t] = ode_rk45(vensim_ode_func, initial_outcome, initial_time, times, minimum_order_processing_time, inventory_adjustment_time);
    array[n_t] real inventory = integrated_result[:, 1];
    array[n_t] real expected_order_rate = integrated_result[:, 2];
    array[n_t] real work_in_process_inventory = integrated_result[:, 3];
    array[n_t] real production_rate_stocked = integrated_result[:, 4];
    array[n_t] real production_start_rate_stocked = integrated_result[:, 5];
    array[n_t] real backlog = integrated_result[:, 6];

    vector[20] production_rate_stocked_obs = to_vector(normal_rng(production_rate_stocked, m_noise_scale));
    vector[20] production_start_rate_stocked_obs = to_vector(normal_rng(production_start_rate_stocked, m_noise_scale));
}
Dashadower commented 1 year ago

Hi.

I would say there's no definitive answer. As you said, the prior is dependent on the user, and there's no generally applicable way to find a prior that best 'fits' the initial values specified in the original model(something trivial like centering a normal distribution might work for some cases). Even then, you're probably going to have to re-calibrate the prior to alleviate computational issues.

So unless you also fix the variables to the values defined within the original model, specifying a prior is a different problem, effectively creating a different model that models the error involved with the measurements of said variables. I don't think the notion of 'syncing' would work here since there's no corresponding component within the Vensim model.

tomfid commented 1 year ago

"different dynamics panning out from minute change of initial values" seems like a red flag here. The model is basically linear, except for a few features like the nonnegative order constraint and the no inventory=no shipments constraint. I don't think there are any positive loops, so there shouldn't be any tipping points or other sources of extreme sensitivity to initial conditions.

Can you plot the time series of the solution? Seems like there might be some kind of integration instability or something like that. MCMC seems like an unlikely candidate. Another possibility is that it's simply the expected response to stochastic customer orders, but I don't see that in the code. Is the input a step, or random with seed variation across simulations?

tomfid commented 1 year ago

In a real supply chain, I think the user is always going to have a decent idea what the throughput of the system is, and some constraints (warehouse capacity upper limits for example), which would be a candidate for informative priors. They might have less sense about the variance, and less-observable quantities like backlog. If these things aren't obvious to them, then maybe you have your problem diagnosis: they don't know what the heck is going on in their system.

hyunjimoon commented 1 year ago

"different dynamics panning out from minute change of initial values" seems like a red flag here.

Just to be clear, I didn't mean I observed different dynamics in this model. I discovered two different inflow for the same object (two matters pointing to one idea) and how serious should we be to prevent this inconsistency.

tomfid commented 1 year ago

I'm not sure how something can have two different inflows, except in the obvious case of two conceptually different things. Seems like a fatal flaw to be avoided.

tomfid commented 1 year ago

minimum_order_processing_time = normal_rng(0.05, 0.001);

This seems very small and will cause instabilities, unless time step is also very small.

tomfid commented 1 year ago

What I would normally do for a system like this is specify a "reference throughput" parameter. Then everything else can be initialized around that. Something like:

reference throughput = 10 ~ widgets/week customer orders = reference throughput + customer variation ~ widgets/week customer variation = (random or test input) ~ widgets/week WIP inventory = INTEG( production starts - shipments, (reference throughput + WIP variation)*production time ) ~ widgets WIP variation = (random) ~ widgets/week etc.

This makes the random inputs 0-mean, so the user only has to specify priors in terms of variances (though I guess you'd need a little more logic to prevent negative values in some cases).

hyunjimoon commented 1 year ago

I agree :)

I'm not sure how something can have two different inflows

This is due to the current implementation which calculates initial values using values embedded in vensim, rather than using values sampled from user-defined prior e.g. this line. @Dashadower could we defer calculating initial values numerically after the estimated parameter's value is realized? So from draws2data file, instead of the following

    real production_start_rate_stocked__init = fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) + 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) - 6 * fmax(0, 100 + 5 + 2 * 100 - 5 + 2 * 100 / 7) / 3;
    real backlog__init = 100 * 2;

could we pass on the following?

real production_start_rate_stocked__init = max(0, 
        (+ manufacturing_cycle_time * max(0, customer_order_rate + (desired_inventory - desired_inventory)/inventory_adjustment_time) 
         - manufacturing_cycle_time * max(0, customer_order_rate + (desired_inventory - desired_inventory)/inventory_adjustment_time)
        )/wip_adjustment_time
        + max(0, 
              customer_order_rate + 
               (max(0,(+ customer_order_rate * (minimum_order_processing_time + safety_stock_coverage) 
                   -  max(0, customer_order_rate * (minimum_order_processing_time + safety_stock_coverage)
                )/inventory_adjustment_time
            )
      )