stan-dev / cmdstan

CmdStan, the command line interface to Stan
https://mc-stan.org/users/interfaces/cmdstan
BSD 3-Clause "New" or "Revised" License

CmdStan on IBM Power-9 systems #1026

Open mjcarter95 opened 3 years ago

mjcarter95 commented 3 years ago

Summary:

CmdStan fails to compile models on IBM Power-9 systems.

This issue was previously discussed in the following Stan Discourse post: https://discourse.mc-stan.org/t/error-running-cmdstan-on-ibm-power9-system/22060. A step-by-step solution is given in the PDF document attached to the first post. It may be worth adding to the Stan documentation.

Description:

Trying to compile a Stan model on an IBM Power-9 system results in the following error:

make ../src/models/deaths_and_111_calls

--- Translating Stan model to C++ code ---
bin/stanc  --o=../src/models/deaths_and_111_calls.hpp ../src/models/deaths_and_111_calls.stan
/bin/bash: bin/stanc: cannot execute binary file
make: *** No rule to make target `../src/models/deaths_and_111_calls.hpp', needed by `../src/models/deaths_and_111_calls'.  Stop.

CmdStan is distributed with pre-compiled stanc binary files that were built for x86 architecture, not Power-9. Because of this, we must build and configure the Stan compiler by hand.
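A quick way to confirm this diagnosis is to compare the machine architecture with whether the bundled binary can start at all. The sketch below is only illustrative: the helper name `can_execute` is made up for this example, and it assumes the stanc binary accepts `--version` (stanc3 does).

```shell
#!/bin/sh
# Hypothetical helper: report whether a binary can be started on this machine.
can_execute() {
    binary="$1"
    if [ ! -e "$binary" ]; then
        echo "missing"
    elif "$binary" --version >/dev/null 2>&1; then
        echo "ok"
    else
        # Covers "cannot execute binary file" (wrong architecture)
        # as well as permission problems.
        echo "not runnable"
    fi
}

# On Power-9 Linux systems this typically prints ppc64le, not x86_64.
echo "machine architecture: $(uname -m)"
echo "bin/stanc: $(can_execute bin/stanc)"
```

Run from the CmdStan directory, a result of "not runnable" on a non-x86 machine is consistent with the pre-compiled x86 binary being the problem.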

Reproducible Steps:

Compile any Stan model on an IBM Power-9 system.

Current Output:

See description.

Expected Output:

A compiled model.

Additional Information:

NA.

Current Version:

Initially seen in v2.23.0; persists in v2.27.0.

rok-cesnovar commented 3 years ago

Would this help? https://github.com/stan-dev/stanc3/issues/926

> the same issue is true for ARM based processors

This is false. For ARM we have a separate tarball in the last 2 releases.

mjcarter95 commented 3 years ago

> Would this help stan-dev/stanc3#926
>
> > the same issue is true for ARM based processors
>
> This is false. For ARM we have a separate tarball in the last 2 releases.

My bad, updated post.

andrjohns commented 3 years ago

Yep, as mentioned in the linked post I'll be updating our release process to automatically build binaries for powerpc systems, so this will be handled automatically soon

andrjohns commented 3 years ago

@mjcarter95 powerpc support should be available now. You can either clone the develop branch of the cmdstan repo and build, or you can just download the stanc binary from the nightly release in the stanc3 repo
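The first option (cloning and building the develop branch) might look like the sketch below. The `run` wrapper and the `DRY_RUN` variable are conventions invented for this example so that the commands are only printed by default; the clone URL is the public cmdstan repository, and `make build` is the usual CmdStan build target.

```shell
#!/bin/sh
# Assumed convention for this sketch: print each command, and only
# execute it when DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_cmdstan_develop() {
    # --recursive pulls in the stan and math submodules.
    run git clone --recursive --branch develop https://github.com/stan-dev/cmdstan.git
    # Build CmdStan (and fetch or build stanc) inside the clone.
    run make -C cmdstan build
}

build_cmdstan_develop
```

Running the script with `DRY_RUN=0` performs the clone and build instead of just printing the commands.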

mjcarter95 commented 3 years ago

@andrjohns Thank you. I just got round to trying this: I pulled the develop branch of CmdStan and am now able to compile both CmdStan and Stan models. However, running the model prints "Segmentation fault" to the terminal with no additional information. Any ideas what the cause might be?

I am using gcc 10.2.0 and the model outlined here

./src/models/deaths_and_111_calls sample num_samples=1000 num_warmup=1000 algorithm=hmc engine=nuts max_depth=10 stepsize=0.01 adapt delta=0.8 data file=data/model_input/nhs_sheffield_ccg.data.json init=data/model_inits/nhs_sheffield_ccg/init1.json output file=output/model_fits/bede/samples1.csv
method = sample (Default)
  sample
    num_samples = 1000 (Default)
    num_warmup = 1000 (Default)
    save_warmup = 0 (Default)
    thin = 1 (Default)
    adapt
      engaged = 1 (Default)
      gamma = 0.050000000000000003 (Default)
      delta = 0.80000000000000004 (Default)
      kappa = 0.75 (Default)
      t0 = 10 (Default)
      init_buffer = 75 (Default)
      term_buffer = 50 (Default)
      window = 25 (Default)
    algorithm = hmc (Default)
      hmc
        engine = nuts (Default)
          nuts
            max_depth = 10 (Default)
        metric = diag_e (Default)
        metric_file =  (Default)
        stepsize = 0.01
        stepsize_jitter = 0 (Default)
    num_chains = 1 (Default)
id = 1 (Default)
data
  file = data/model_input/nhs_sheffield_ccg.data.json
init = data/model_inits/nhs_sheffield_ccg/init1.json
random
  seed = 2776634896 (Default)
output
  file = output/model_fits/bede/samples1.csv
  diagnostic_file =  (Default)
  refresh = 100 (Default)
  sig_figs = -1 (Default)
  profile_file = profile.csv (Default)
num_threads = 1 (Default)

Segmentation fault

andrjohns commented 3 years ago

Are you able to compile and run the bernoulli example model that's included with CmdStan?

mjcarter95 commented 3 years ago

Yes, the bernoulli example compiles and runs.

andrjohns commented 3 years ago

That's a good sign, at least. Can you share your model code and the .hpp that is generated by Stan? That way I can check that the same C++ is being generated across systems.

I'm assuming that the model compiled and sampled under the stanc3 that you built locally? Can you try using the stanc binary that you built previously, but with the cmdstan that you just cloned? That way we can check whether the segfault is due to stanc3 or due to changes in cmdstan.

mjcarter95 commented 3 years ago

I've uploaded the model code and .hpp to Google Drive, hopefully you can access them here, please let me know if you have any trouble accessing them.

I removed the stanc3 that was built locally, so the model should have been compiled using stanc3 that is referenced in the develop branch.

andrjohns commented 3 years ago

Alright, the generated .hpp is identical to what I get locally, so this might not be a stanc3 issue. Were you able to successfully run the model when using the locally-built stanc?

Can you share some data that reproduces the issue?
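For anyone repeating the .hpp comparison, a byte-for-byte check is enough to tell whether two systems generated the same C++. This is a minimal sketch; the function name and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Print "identical" if two generated .hpp files match byte for byte,
# otherwise "different" (also printed if either file cannot be read).
compare_hpp() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Example usage (hypothetical file names):
#   compare_hpp power9/deaths_and_111_calls.hpp x86/deaths_and_111_calls.hpp
```

If the result is "different", `diff -u` on the same two files shows where the generated code diverges.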

mjcarter95 commented 3 years ago

Yes, I was able to run the model when using the locally built stanc (note that this was coupled with the latest stable release of CmdStan and not the develop branch; I'll try that combination later this evening if I get the chance).

I've uploaded the json data and inits to Google Drive shared earlier. Using the following to sample:

./src/models/deaths_and_111_calls sample num_samples=1000 num_warmup=1000 algorithm=hmc engine=nuts max_depth=10 stepsize=0.01 adapt delta=0.8 data file=data/model_input/nhs_sheffield_ccg.data.json init=data/model_inits/nhs_sheffield_ccg/init1.json output file=output/model_fits/bede/samples1.csv

andrjohns commented 3 years ago

Alrighty, I also get a segfault with that data, so this is a cmdstan issue not stanc3 (phew for the multiarch). I'll start digging into this now

andrjohns commented 3 years ago

It looks like the segfault is related to your initial values, because the model runs fine when they're not included. Odd.
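One way to reproduce that observation is to run the model twice, once with and once without the `init=` argument, and compare exit statuses. The sketch below reuses the paths from the command earlier in the thread but drops the sampler settings for brevity; the helper names are invented here, and the mapping of status 139 to a segfault assumes a common shell such as bash or dash on Linux (128 + signal 11).

```shell
#!/bin/sh
# Translate a shell exit status into a short description.
describe_exit() {
    case "$1" in
        0)   echo "ok" ;;
        139) echo "segfault" ;;   # 128 + 11 (SIGSEGV) in bash/dash on Linux
        *)   echo "failed with status $1" ;;
    esac
}

# Run a command, discard its output, and describe how it exited.
run_and_report() {
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    describe_exit "$status"
}

MODEL=./src/models/deaths_and_111_calls
DATA=data/model_input/nhs_sheffield_ccg.data.json
INIT=data/model_inits/nhs_sheffield_ccg/init1.json

echo "with init:    $(run_and_report "$MODEL" sample data file="$DATA" init="$INIT")"
echo "without init: $(run_and_report "$MODEL" sample data file="$DATA")"
```

A "segfault" on the first line and "ok" on the second would match the behaviour described above and point at the contents of the init file.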

rok-cesnovar commented 2 years ago

@mjcarter95 was this issue resolved?