StanJulia / StanSample.jl

WIP: Wrapper package for the sample method in Stan's cmdstan executable.
MIT License

dimensions of data seem to be swapped in v6.4.0 #51

Closed itsdfish closed 2 years ago

itsdfish commented 2 years ago

Hi Rob,

It seems like the rows and columns are not imported to Stan correctly... or I am doing something stupid. I receive an error in 6.4.0 but not in 5.6.

Would you be able to look into this please? Thanks!

temp_model.stan

data { 
    int n_rows;
    int<lower=1> n_cols;
    int<lower=0> x[n_rows,n_cols]; 
}

parameters {
   real mu;
} 

model {
    mu ~ normal(0, 1);
}

run_temp_model

using StanSample, Random
tempdir = pwd() * "/tmp"
stan_path = "/home/dfish/cmdstan"
seed = 65445
Random.seed!(seed)

n_rows = 50
n_cols = 2
x = fill(0, n_rows, n_cols)

stan_data = Dict("x" => x, "n_rows"=>n_rows, "n_cols"=>n_cols)
println(size(stan_data["x"]))

# load the Stan model file
stream = open("temp_model.stan", "r")
model = read(stream, String)
close(stream)
stan_model = SampleModel("temp_model", model, tempdir)

stan_sample(
    stan_model;
    data = stan_data,
    seed,
    num_chains = 4,
    num_samples = 1000,
    num_warmups = 1000,
    save_warmup = false
)

Error

Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=x; position=0; dims declared=(50,2); dims found=(2,50) (in '/home/dfish/.julia/dev/InterferenceEffectModelComparison/critical_tests/jrm/tmp/temp_model.stan', line 4, column 4 to column 34)

goedman commented 2 years ago

Hi Chris, will take a look tonight! Two things have changed: the switch to JSON (with R notation to be dropped in the future), and the data section definitions are changing.

On May 2, 2022, at 14:29, dfish wrote: [original report quoted in full; see above]

goedman commented 2 years ago

Hi Chris, yes, you are correct. With the switch to JSON, the difference between row-first and column-first ordering becomes visible: Julia stores arrays column-first, while the nested arrays in the JSON file are read row-first. To get around it you can call rc = stan_sample(sm; data, use_json=false), which for the time being is still accepted. Or you can transpose x, i.e. in the script below: x = fill(0, n_rows, n_cols)'.

🎈 chris.jl — Pluto.jl.pdf

Note: I could hide this in the call to JSON (update_json_files.jl) but want to think about that a bit.
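
The dimension swap described above can be reproduced without Stan. The following is a minimal sketch, assuming the JSON writer emits a matrix column by column as nested arrays; `nest_by_columns` and `nested_dims` are hypothetical helpers for illustration, not part of StanSample.jl or JSON.jl.

```julia
# Assumption: the serializer emits a matrix column by column, as nested arrays.
nest_by_columns(m::AbstractMatrix) = [collect(col) for col in eachcol(m)]

# Dimensions that a row-first reader (such as Stan) infers from nested arrays:
# the outer length is taken as the first dimension.
nested_dims(v) = (length(v), length(first(v)))

n_rows, n_cols = 50, 2
x = reshape(1:(n_rows * n_cols), n_rows, n_cols)

as_is = nest_by_columns(x)                    # outer length = number of columns
transposed = nest_by_columns(permutedims(x))  # outer length = number of rows

println(nested_dims(as_is))
println(nested_dims(transposed))
```

Under that assumption, the untransposed matrix is seen with its dimensions swapped, while the transposed one arrives as declared, and `transposed[i][j]` holds `x[i, j]`.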

itsdfish commented 2 years ago

Rob, thanks for explaining.

I wonder whether StanSample should make an adjustment to the JSON so that it is consistent with Julia's indexing system. In the present case, the error was caught because the dimensions were incorrect. However, the code would run and return invalid output if I used a non-symmetric n × n array.
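
A small sketch of the silent failure mode for square arrays, again assuming a column-by-column serializer and a row-first reader (both simulated here, nothing is sent to Stan):

```julia
# A non-symmetric 3 x 3 matrix.
x = [1 2 3; 4 5 6; 7 8 9]

# Assumption: the serializer nests the matrix column by column.
nested = [collect(col) for col in eachcol(x)]

# A row-first reader treats nested[i][j] as element (i, j).
seen = [nested[i][j] for i in 1:3, j in 1:3]

println(size(seen) == size(x))     # dimensions agree, so no error is raised
println(seen == permutedims(x))    # the reader actually sees the transpose
println(seen == x)                 # the data no longer match the original
```

Because the declared and found dimensions agree, no dimension check can flag the problem; only the element values differ.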

goedman commented 2 years ago

Ok, I think I have a solution for Dicts and NamedTuples. Will do some more tests with an Array of these and merge later today. One issue was I didn't want to go into JSON.jl and NamedTuples are immutable.
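
One way to handle both container types without mutating the input is to build a new container with transposed matrices. This is only a sketch under the assumptions discussed in this thread; `flip` and `flip_data` are hypothetical names, not the functions used in StanSample.jl.

```julia
# Hypothetical helpers, not StanSample's API.
flip(v) = v                                 # scalars and vectors pass through
flip(m::AbstractMatrix) = permutedims(m)    # swap rows and columns

# Dicts: build a new Dict rather than modifying the caller's data.
flip_data(d::AbstractDict) = Dict(k => flip(v) for (k, v) in pairs(d))

# NamedTuples are immutable, so map returns a new NamedTuple with the same keys.
flip_data(nt::NamedTuple) = map(flip, nt)

# An array of Dicts or NamedTuples is handled element by element.
flip_data(ds::AbstractVector) = map(flip_data, ds)

d = Dict("x" => fill(0, 50, 2), "n_rows" => 50, "n_cols" => 2)
nt = (x = fill(0, 50, 2), n_rows = 50, n_cols = 2)

println(size(flip_data(d)["x"]))
println(size(flip_data(nt).x))
println(length(flip_data([d, d])))
```

The caller's `d` and `nt` keep their original shapes; only the copies handed to the serializer are transposed.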

itsdfish commented 2 years ago

Thanks for looking into this issue. I think it will be good to have consistency in the indexing scheme so people do not perform the wrong analysis.

goedman commented 2 years ago

Looks like it has been merged. It didn’t update the tag, but will update.

itsdfish commented 2 years ago

Thanks, Rob! The update has fixed the problem.

AndyPohlNZ commented 2 years ago

Hi Rob,

Thanks for putting work into getting Stan and Julia playing nicely together and for the fix to the above issue. Sorry for raising this, but the fix implemented does not seem to generalize to multidimensional arrays. I am using the master branch of StanSample: StanSample v6.9.2 https://github.com/StanJulia/StanSample.jl#master. See the modified example below:

using StanSample
ProjDir = @__DIR__
n1 = 50
n2 = 2
n3 = 27
x = fill(0, n1, n2, n3)

stan_data = Dict("x" => x, "n1" => n1, "n2" => n2, "n3" => n3)
println(size(stan_data["x"]))

mdl = "
data { 
    int n1;
    int<lower=1> n2;
    int<lower=1> n3;
    array[n1, n2, n3] real x;
}

parameters {
   real mu;
} 

model {
    mu ~ normal(0, 1);
}
"

tmpdir = joinpath(ProjDir, "tmp")
isdir(tmpdir) && rm(tmpdir; recursive=true)
stan_model = SampleModel("temp_model", mdl, tmpdir)
stan_sample(
    stan_model;
    data=stan_data,
    seed=123,
    num_chains=4,
    num_samples=1000,
    num_warmups=1000,
    save_warmup=false
)

Returns the following error: Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=x; position=0; dims declared=(50,2,27); dims found=(27,2,50)
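
The reversed dimensions in this error are what one would expect if an N-dimensional array is nested with its last dimension outermost. The sketch below simulates that assumption and shows that reversing the dimension order with `permutedims` before serializing restores the declared shape; `nest`, `nested_dims` and `reverse_dims` are hypothetical helpers, not StanSample.jl functions.

```julia
# Assumption: a column-wise serializer nests the last dimension outermost.
nest(a::AbstractVector) = collect(a)
nest(a::AbstractArray) = [nest(s) for s in eachslice(a; dims=ndims(a))]

# Dimensions a row-first reader infers from the nested arrays.
nested_dims(v::AbstractVector{<:Number}) = (length(v),)
nested_dims(v::AbstractVector) = (length(v), nested_dims(first(v))...)

# Reverse the order of all dimensions: (n1, n2, n3) becomes (n3, n2, n1).
reverse_dims(a::AbstractArray) =
    permutedims(a, reverse(ntuple(identity, ndims(a))))

x = reshape(collect(1.0:(50 * 2 * 27)), 50, 2, 27)

println(nested_dims(nest(x)))                 # dimensions as the reader sees them
fixed = nest(reverse_dims(x))
println(nested_dims(fixed))                   # dimensions after reversing
println(fixed[7][2][5] == x[7, 2, 5])         # element order is preserved
```

For a matrix this reduces to a plain transpose, which is why a 2-D-only fix would not cover the three-dimensional case.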