biocore / BIRDMAn

Bayesian Inferential Regression for Differential Microbiome Analysis
BSD 3-Clause "New" or "Revised" License

Temporary files usage #36

Closed mortonjt closed 3 years ago

mortonjt commented 3 years ago

One thing that I've been noticing is that right after a CmdStanMCMC session exits, all of the temporary files are deleted. So if one wishes to perform diagnostics or create InferenceData objects in a cluster setting, they won't be able to.

There are two approaches to fixing this problem:

  1. Figure out how to not use temporary files with Stan by specifying the output directory (output_dir)
  2. Create InferenceData objects directly from the CmdStanMCMC objects in the same session, and pass around InferenceData objects instead.

I've gotten the 2nd strategy working on q2-differential. The drawback is that it will be more difficult to perform cross-validation down the road (which is still pending due to https://github.com/stan-dev/cmdstanpy/issues/345 and https://github.com/stan-dev/cmdstanpy/issues/251).
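For reference, a minimal sketch of what the 2nd strategy could look like: convert the fit while the session (and its temporary CSV files) is still alive, then write the result to disk. `arviz.from_cmdstanpy` and `InferenceData.to_netcdf` are existing ArviZ calls; the helper name `save_inference_data` and its `converter` argument are made up for illustration and are not part of BIRDMAn or q2-differential.

```python
from pathlib import Path


def save_inference_data(fit, path, converter=None):
    """Convert `fit` to InferenceData right away and write it to `path`.

    Hypothetical helper: `fit` is assumed to be a CmdStanMCMC object whose
    temporary CSV files still exist, i.e. this runs in the sampling session.
    """
    if converter is None:
        # Imported lazily so the helper can be defined without ArviZ installed.
        import arviz as az

        def converter(f):
            return az.from_cmdstanpy(posterior=f)

    idata = converter(fit)  # reads the draws while the CSV files are present
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    idata.to_netcdf(str(path))  # netCDF file outlives the temporary directory
    return path
```

Calling `save_inference_data(fit, "results/fit.nc")` straight after sampling would leave a file that another job can reopen with `arviz.from_netcdf("results/fit.nc")`, without needing the original CmdStan output.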

gibsramen commented 3 years ago

For (1) can we pass output_dir into the sampler_args dict of Model.fit?
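Something along these lines is what I have in mind, as a rough sketch only. cmdstanpy's `sample` does accept an `output_dir` argument; the helper name `with_output_dir` is hypothetical, and whether `Model.fit` forwards `sampler_args` to the sampler unchanged is an assumption.

```python
from pathlib import Path


def with_output_dir(sampler_args, output_dir):
    """Return a copy of `sampler_args` with `output_dir` set.

    Hypothetical helper. The directory is created up front so the CSV files
    land in a persistent location instead of a temporary one.
    """
    out = Path(output_dir).expanduser().resolve()
    out.mkdir(parents=True, exist_ok=True)
    merged = dict(sampler_args or {})  # copy, so the caller's dict is untouched
    merged["output_dir"] = str(out)
    return merged
```

The call would then be something like `model.fit(sampler_args=with_output_dir({"chains": 4}, "stan_output"))`, where the `chains` value is just a placeholder.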

mortonjt commented 3 years ago

I should think so, but other problems arise if output_dir is specified; for some reason the csv files aren't being parsed correctly. I'm getting a ValueError: line 463: bad draw, expecting 137 items, found 37 (full traceback at https://github.com/mortonjt/q2-differential/pull/19#issuecomment-828833786).

It may be a problem in Stan; will need to get a more minimal example to help with debugging.

mortonjt commented 3 years ago

Closing this, since it is no longer relevant with the new architecture.