stan-dev / cmdstanr

CmdStanR: the R interface to CmdStan
https://mc-stan.org/cmdstanr/
Other
144 stars 63 forks source link

Occasional segfault when returning tuple as R list from exposed function #1001

Open adamgorm opened 5 months ago

adamgorm commented 5 months ago

I have written several Stan functions that return tuples (as R lists). Most of the time they work as intended, but occasionally R crashes due to a segfault with a message such as

 *** caught segfault ***
address 0x5647a22ee788, cause 'memory not mapped'

Similarly, I have also had core dumps with the messages corrupted double-linked list or malloc(): unsorted double linked list corrupted.

I have not been able to get it to happen in a reproducible manner - it seems like the same input can sometimes lead to successful execution and sometimes to a segfault.

I use Fedora 40, R 4.4.0, cmdstan 2.35.0, cmdstanr 0.8.1.

As an example, one of my functions that occasionally gives this issue is

  tuple(real, vector, matrix)
  prep_multi_cens_log_lik(vector yo,
                          array[] real to, array[] real tc,
                          array[] int to_is, array[] int tc_is,
                          array[] int Jo, array[] int Jc,
                          real magnitude_mu, real length_scale_mu,
                          real magnitude_eta, real length_scale_eta,
                          real sigma)
  {
    int n = num_elements(Jc); /* number of groups */
    int No = sum(Jo);
    int Nc = sum(Jc);

    int row_start, row_end, col_start, col_end;

    matrix[Nc, Nc] cov_c;
    row_end = 0;
    for (i in 1:n) {
      if (Jc[i] == 0)
        continue;
      row_start = row_end + 1;
      row_end = row_end + Jc[i];
      col_end = 0;
      for (j in 1:n) {
        if (Jc[j] == 0)
          continue;
        col_start = col_end + 1;
        col_end = col_end + Jc[j];
        if (i == j) {
          cov_c[row_start:row_end, col_start:col_end] =
            add_diag(gp_exp_quad_cov(tc[row_start:row_end], tc[col_start:col_end],
                                     magnitude_mu, length_scale_mu) +
                     gp_exp_quad_cov(tc[row_start:row_end], tc[col_start:col_end],
                                     magnitude_eta, length_scale_eta),
                     sigma^2);
        } else {
          cov_c[row_start:row_end, col_start:col_end] =
            gp_exp_quad_cov(tc[row_start:row_end], tc[col_start:col_end],
                            magnitude_mu, length_scale_mu) -
            gp_exp_quad_cov(tc[row_start:row_end], tc[col_start:col_end],
                            magnitude_eta, length_scale_eta) / (n-1);
        }
      }
    }

    matrix[Nc, No] cov_c_o;
    row_end = 0;
    for (i in 1:n) {
      if (Jc[i] == 0)
        continue;
      row_start = row_end + 1;
      row_end = row_end + Jc[i];
      col_end = 0;
      array[Jc[i]] real tc_slice = tc[row_start:row_end];
      for (j in 1:n) {
        if (Jo[j] == 0)
          continue;
        col_start = col_end + 1;
        col_end = col_end + Jo[j];
        array[Jo[j]] real to_slice = to[col_start:col_end];
        if (i == j) {
          cov_c_o[row_start:row_end, col_start:col_end] =
            gp_exp_quad_cov(tc_slice, to_slice,
                            magnitude_mu, length_scale_mu) +
            gp_exp_quad_cov(tc_slice, to_slice,
                            magnitude_eta, length_scale_eta);
        } else {
          cov_c_o[row_start:row_end, col_start:col_end] =
            gp_exp_quad_cov(tc_slice, to_slice,
                            magnitude_mu, length_scale_mu) -
            gp_exp_quad_cov(tc_slice, to_slice,
                            magnitude_eta, length_scale_eta) / (n-1);
        }
      }
    }

    matrix[No, No] cov_o;
    row_end = 0;
    for (i in 1:n) {
      if (Jo[i] == 0)
        continue;
      row_start = row_end + 1;
      row_end = row_end + Jo[i];
      col_end = 0;
      for (j in 1:n) {
        if (Jo[j] == 0)
          continue;
        col_start = col_end + 1;
        col_end = col_end + Jo[j];
        if (i == j) {
          cov_o[row_start:row_end, col_start:col_end] =
            add_diag(gp_exp_quad_cov(to[row_start:row_end], to[col_start:col_end],
                                     magnitude_mu, length_scale_mu) +
                     gp_exp_quad_cov(to[row_start:row_end], to[col_start:col_end],
                                     magnitude_eta, length_scale_eta),
                     sigma^2);
        } else {
          cov_o[row_start:row_end, col_start:col_end] =
            gp_exp_quad_cov(to[row_start:row_end], to[col_start:col_end],
                            magnitude_mu, length_scale_mu) -
            gp_exp_quad_cov(to[row_start:row_end], to[col_start:col_end],
                            magnitude_eta, length_scale_eta) / (n-1);
        }
      }
    }

    real lpdf_o = multi_normal_lpdf(yo | rep_vector(0, No), cov_o);
    vector[Nc] mean_c_cond_o = cov_c_o * (cov_o \ yo);
    matrix[Nc, Nc] cov_c_cond_o = cov_c - cov_c_o * (cov_o \ cov_c_o');

    return (lpdf_o,
            mean_c_cond_o,
            cov_c_cond_o);
  }  

which I expose to R using

stan_funs <- cmdstanr::cmdstan_model("functions.stan",
                                     force_recompile = TRUE,
                                     compile_standalone = TRUE)$functions