1313e / PRISM

An alternative to MCMC for rapid analysis of models
https://prism-tool.readthedocs.io
BSD 3-Clause "New" or "Revised" License

PRISM for spectra #17

Closed lazygun37 closed 5 years ago

lazygun37 commented 5 years ago

Hi,

I'm a total novice when it comes to the methods PRISM uses, but I'm quite interested in using it to speed up the model exploration/fitting associated with our radiative transfer code. Briefly, we have a Monte Carlo code which predicts observed spectra. The issue is that the code is slow -- depending on the application, a single run can take minutes to many hours. So direct model fitting to observational data is almost impossible.

Do you think PRISM would be a useful way to speed things up? I can see three main issues here:

  1. Spectra can have thousands of data points. Given that PRISM uses a separate emulator system for each data point (as I understand it), is it feasible to use it in this way?
  2. When fitting real spectra with models, the worst discrepancies are often "systematic" (basically what you call model discrepancy variance, I believe). The trouble is that, in the context of spectra, these discrepancies are often concentrated in distinct regions of the data set -- e.g. specific atomic transitions are badly fit by just about every set of model parameters. I'm not sure, but it sounds like that would make a "maximum implausibility" criterion -- even something like I_min,2 or whatever -- quite tricky to use.
  3. Given just how slow our code is, is there a way to minimize the number of new, full model evaluations beyond the default level PRISM would provide?

Thanks for your help!

Cheers,

Christian

1313e commented 5 years ago

Hi @lazygun37,

thanks for your questions.

PRISM's methodology was specifically designed to be powerful in situations where only limited knowledge (observational data) is available. Using too many data points will therefore reduce its usefulness, as it creates an emulator system for every data point (as you already pointed out). However, depending on how big the emulator systems become and how long a single model evaluation normally takes, PRISM can still be useful here (it just will not reach the few thousand emulator evaluations per second that it normally achieves). In this scenario, it would create quite a few HDF5 files though, as every emulator system has its own HDF5 file.

The solution to this is to use only specific data points of your spectra, selected for their accuracy and importance to the overall results. As PRISM is mainly an exploratory analysis tool that uses approximations, it cannot do proper parameter estimation on its own, but it can enhance such estimation through hybrid sampling. If that is what you are looking for, I think that PRISM will be able to do a good job.
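To make the data-point selection concrete, here is a minimal, hypothetical helper (not part of PRISM's API) that thins a spectrum down to a manageable subset while skipping wavelength regions that are known to be badly modelled. The grid, region bounds, and function name are all illustrative assumptions:

```python
# Hypothetical helper (not a PRISM function): pick a small, informative
# subset of spectral points, skipping known badly-modelled regions.
def select_data_points(wavelengths, bad_regions, n_points):
    """Return at most n_points indices, roughly evenly spaced over the
    spectrum, excluding wavelengths inside any (lo, hi) bad region."""
    # Keep only indices whose wavelength lies outside every bad region
    good = [i for i, w in enumerate(wavelengths)
            if not any(lo <= w <= hi for lo, hi in bad_regions)]
    # Thin the remaining points to an approximately even subsample
    step = max(1, len(good) // n_points)
    return good[::step][:n_points]

# Example: a 3000-point spectrum with two poorly-modelled line regions
waves = [4000 + 0.5 * i for i in range(3000)]  # illustrative Angstrom grid
bad = [(4860, 4870), (5000, 5010)]             # e.g. known problem lines
idx = select_data_points(waves, bad, 50)
```

The indices in `idx` would then determine which data points are handed to PRISM, so that only fifty emulator systems (and HDF5 files) are created rather than thousands.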

The model discrepancy variance would indeed then be dominated by these specific atomic transitions. In that case, you would simply return a high variance whenever a data point is requested there. You can also add as many wildcards to the implausibility cut-offs as you want to account for some of these effects early on. Keep in mind though that PRISM's job is to become as accurate as the model, and to use that information to look for plausible model realizations. If your model is fairly inaccurate for some data points, the emulator will be just as inaccurate there, and the analysis must account for this. What I am trying to say here is that PRISM cannot make you an emulator that does a BETTER job than your own model.
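The two ideas above (inflated discrepancy variance and wildcards) can be sketched in a few lines. This is an illustrative toy version of an implausibility check, not PRISM's implementation; the function names and the convention that a cut-off of 0 acts as a wildcard are assumptions made here for the example:

```python
import math

def implausibility(mod, obs, var_obs, var_md, var_em=0.0):
    """Implausibility of one data point: the standardized distance
    between the (emulated) model output and the observation."""
    return abs(mod - obs) / math.sqrt(var_obs + var_md + var_em)

def is_plausible(impl_values, impl_cut):
    """Check the i-th highest implausibility against impl_cut[i].
    A cut-off of 0 is treated as a wildcard: that rank is ignored."""
    ranked = sorted(impl_values, reverse=True)
    for value, cut in zip(ranked, impl_cut):
        if cut > 0 and value > cut:
            return False
    return True

# A badly-fit line point gets a large model discrepancy variance (10.0),
# which keeps its implausibility modest despite the poor fit:
impl = [implausibility(1.0, 5.0, 0.1, 10.0),  # bad line, inflated var_md
        implausibility(2.0, 2.1, 0.1, 0.0),
        implausibility(3.0, 2.8, 0.1, 0.0)]
ok = is_plausible(impl, [0, 3.0, 3.0])  # leading 0 = one wildcard
```

With one wildcard in front, the single worst data point is ignored outright, and the remaining points easily pass the cut-off of 3, so `ok` is `True`. Inflating the discrepancy variance and adding wildcards are thus two complementary ways to stop a handful of badly-modelled transitions from ruling out otherwise plausible parameter sets.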

Uhm, I don't really understand your third question. What exactly do you mean by "minimizing the number of new, full model evaluations beyond the default level PRISM would provide"?