Closed bogaotory closed 3 years ago
You have the right idea. Of course, you would not bother to evaluate both likelihood and log likelihood first if you only wanted one of these. Also, the dmeasure function does not return a value, so the code would look more like:
if (give_log) {
lik = <compute log_likelihood>;
} else {
lik = <compute likelihood>;
}
With respect to your second request, I'm not sure exactly what you want to know, but here are a couple of reflections.
log
and exp
are imperfect maps between the two spaces because, on a finite machine, they lose precision. If they were perfect maps, it would clearly be sufficient to ask the user to provide only one.ddist
functions have a give_log
parameter.
pomp merely recapitulates this behavior, for the same reason.Thanks for the prompt reply.
I guess my question is when do pomp's algorithms such as mif2
decide to calculate the log likelihood instead of likelihood and vice versa? The user implement dmeasure
such that it may return log likelihood when give_log
is TRUE, and return likelihood when give_log
is FALSE. But the dmeasure
function is called by algorithms such as mif2
, not by the user. So when dmeasure
is called, I guess pomp needs to make a decision as to what the value of give_log
is? And how is that decision made?
Yes, pomp makes that decision. The decision depends on the algorithm. The user must provide for both cases. To figure out when it is called with what value, you can always refer to the source (e.g., here). But the point is that the user doesn't need to worry about that. pomp is designed to be more than a set of data analysis tools: it is also a platform for the implementation of inference algorithms. If there is a chance that a future algorithm would want a different choice than an existing algorithm, both possibilities must be allowed for. I suppose one could even imagine an algorithm that made the choice at runtime or even changed the choice adaptively. At the moment, however, each pomp algorithm has its choice hard-coded.
Permit me to ask, why are these matters of interest to you?
Thanks for pointing me to the right place in the source code. I was looking at dmeasure.c
for the answer because I thought there might be a function in there to control the decision at runtime.
Of course, happy to share my thoughts. Principally, it is of interest because I wanted to know exactly how everything works, and to make sure that there is no misunderstandings at the interface between my model implementation and pomp.
More specifically, the paragraph on dmeasure
was a little confusing to read if I may say so. As in I get the messages after reading it, but I am not quite sure if you mean what I thought you meant, if that makes sense. Since you emphasised "This is one of the most common places where newbies (like me) make mistakes", I didn't want to take any chances, and thought I'd better get some confirmation/clarification if possible.
As I started reading this section on dmeasure
, after reading the code, this give_log
variable seemed to have came out of nowhere. The standard dpois
function, as you referenced, has a parameter log
, but why is it called give_log
here I started to wonder. Could it be a typo or is it a versioning issue? I started to have doubts from here on. If you'd wrote something like lik = dpois(pop,b*N,log=give_log);
it would make more sense, but that's not really a valid C syntax I suppose.
Also a different way to interpret the first half of this paragraph, as I did for a long moment, was that you were saying that it is the user's duty to write either lik = dpois(pop,b*N,0);
or lik = dpois(pop,b*N,1);
. The second half of the paragraph contradicts this assumption, but it also did not clarify what the "needs" are as in "... according to its needs." So now I have doubts for both halves of the paragraph, and can't decide which to go with.
If I may make a couple of suggestions, it would be to 1. declare right after the code that give_log
is a variable local to pomp. (You've made it quite clear why Csnippet is the preferred method at the beginning of the tutorial, but I have to admit I get headaches reading R's syntax and it is made even worse when Csnippets are involved. The biggest headache I have is I simply don't know where everything is defined and what their scopes are.) 2. clarify "...according to its needs" with an example such as mif2
always uses log likelihood because ... (well I don't really know the reason :) ).
Thanks very much for your help again.
Thanks for your suggestions, @bogaotory. I will spend some time with the relevant paragraph in the "Getting Started" document. More generally, figuring out how to communicate the demands that are placed on the user without burdening him or her with extraneous information has been a challenge. Clearly, there's room for improvement.
The standard
dpois
function, as you referenced, has a parameterlog
, but why is it calledgive_log
here I started to wonder. Could it be a typo or is it a versioning issue? I started to have doubts from here on. If you'd wrote something likelik = dpois(pop,b*N,log=give_log);
it would make more sense, but that's not really a valid C syntax I suppose.
Yes, it's unfortunate, but log
is defined in the standard C math library, so it can't be used as a variable name in C snippets. If its any consolation, the same convention is found in the R source code (e.g., dpois
).
https://github.com/kingaa/pomp/blob/467720ba63418590a5a96751cf51acf312f58d47/vignettes/getting_started.Rmd#L731-L737
My (A newbie's) understanding of the
give_log
(log
in the case of R-defineddmeasure
function) parameter is that essentially thedmeasure
function needs to have a structure that looks something like this:Is this correct? Could you also point me to the right direction to learn bout where the
give_log
parameter/flag fit inside the algorithm please? As in when does the algorithm decide to evaluate "log likelihood" instead of "likelihood" and vice versa? (It always seem to be FALSE when I ran mydmeasure
withmif2
). Thanks very much.