stan-dev / posteriordb

Database with posteriors of interest for Bayesian inference

Simplify model and data info files #131

Open eerolinna opened 4 years ago

eerolinna commented 4 years ago

Currently, model and data info files mix two unrelated kinds of information.

For example, in models/info/eight_schools_noncentered.json:

{
  "name": "eight_schools_noncentered",
  "title": "A non-centered hiearchical model for 8 schools",
  "description": "A non-centered hiearchical model for the 8 schools example of Rubin (1981)",
  "keywords": [
    "hiearchical"
  ],
  "references": [
    "rubin1981estimation",
    "gelman2013bayesian"
  ],
  "urls": [
    "http://www.stat.columbia.edu/~gelman/arm/examples/schools"
  ],
  "prior": {
    "keywords": []
  },
  "model_implementations": {
    "stan": {
      "model_code": "models/stan/eight_schools_noncentered.stan"
    }
  },
  "added_by": "Mans Magnusson",
  "added_date": "2019-08-12"
}

model_implementations and, arguably, name belong to the internal workings of posteriordb, while the other fields are information intended for users.

I propose a new format:

{
  "name": "eight_schools_noncentered",
  "model_implementations": {
    "stan": {
      "model_code": "models/stan/eight_schools_noncentered.stan"
    }
  },
  "information": {
    "title": "A non-centered hiearchical model for 8 schools",
    "description": "A non-centered hiearchical model for the 8 schools example of Rubin (1981)",
    "keywords": [
      "hiearchical"
    ],
    "references": [
      "rubin1981estimation",
      "gelman2013bayesian"
    ],
    "urls": [
      "http://www.stat.columbia.edu/~gelman/arm/examples/schools"
    ],
    "prior": {
      "keywords": []
    },
    "added_by": "Mans Magnusson",
    "added_date": "2019-08-12"
  }
}

where the information intended for the user sits under the information slot. Calling model.information (Python) or info (R) would then return only the information slot.
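To illustrate the accessor behaviour described above, here is a minimal Python sketch. The Model class below is a hypothetical stand-in, not the actual posteriordb class, and the code_path helper is invented purely for illustration; the JSON is a trimmed copy of the proposed format.

```python
import copy
import json

# Trimmed copy of the proposed info file format from this issue.
RAW = """
{
  "name": "eight_schools_noncentered",
  "model_implementations": {
    "stan": {"model_code": "models/stan/eight_schools_noncentered.stan"}
  },
  "information": {
    "title": "A non-centered hiearchical model for 8 schools",
    "keywords": ["hiearchical"],
    "added_by": "Mans Magnusson",
    "added_date": "2019-08-12"
  }
}
"""


class Model:
    """Hypothetical stand-in for a posteriordb model object (not the real class)."""

    def __init__(self, info: dict):
        # Full parsed info file, including the internal fields.
        self._info = info

    @property
    def name(self) -> str:
        return self._info["name"]

    @property
    def information(self) -> dict:
        # Expose only the user-facing slot. A deep copy keeps callers from
        # modifying the stored info through the returned dict.
        return copy.deepcopy(self._info["information"])

    def code_path(self, framework: str) -> str:
        # Hypothetical helper: internal lookups read the top-level fields directly.
        return self._info["model_implementations"][framework]["model_code"]


model = Model(json.loads(RAW))
user_info = model.information
```

With this split, user_info contains only title, keywords, added_by and added_date, while the implementation paths stay reachable for the library itself.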

MansMeg commented 4 years ago

I think the idea is good, although I don't see a very large benefit for now. So I suggest we hold off on this and see whether it becomes necessary further along.

eerolinna commented 4 years ago

I think the benefit is clearer semantics. I don't know if waiting will help here, as these kinds of changes never really become strictly necessary. Or perhaps better put: these kinds of changes never become urgent.

If you think this would be good but don't want to spend time on it, I can probably update everything except the R library in about 30 minutes. (I mean 30 minutes of effort, not that I would do it right away; I probably wouldn't get to it for a while.)
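For reference, the mechanical part of converting a model info file could look roughly like the sketch below. The to_new_format function and the INTERNAL_KEYS tuple are hypothetical names; the choice of internal keys follows the model example in this issue, and data info files would presumably need their own list, which is not shown here.

```python
import copy

# Assumption: these are the fields treated as posteriordb internals for
# model info files; every other field moves under "information".
INTERNAL_KEYS = ("name", "model_implementations")


def to_new_format(old: dict) -> dict:
    """Return a copy of an old-style model info dict in the proposed layout."""
    new = {key: copy.deepcopy(old[key]) for key in INTERNAL_KEYS if key in old}
    new["information"] = {
        key: copy.deepcopy(value)
        for key, value in old.items()
        if key not in INTERNAL_KEYS
    }
    return new


# Old-style content, trimmed from the example at the top of this issue.
old = {
    "name": "eight_schools_noncentered",
    "title": "A non-centered hiearchical model for 8 schools",
    "prior": {"keywords": []},
    "model_implementations": {
        "stan": {"model_code": "models/stan/eight_schools_noncentered.stan"}
    },
    "added_by": "Mans Magnusson",
    "added_date": "2019-08-12",
}

new = to_new_format(old)
```

The input dict is left untouched, so the conversion can be run and checked before any file is overwritten.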

If you don't actually think this is useful and were just being polite, we can wait and see whether the situation changes. If you are not sure whether this is a good change and don't have time to think about it now, we can also wait.