Reasons for log-transformation

chonlei commented 4 years ago

I think it's a bit unfair to say we don't know the reason for log-transformation. We do know quite a bit why it helps...

1. positively constrained to unconstrained
2. skewed distributions to less skewed
3. similar variance for all parameters

I think these are all good reasons to mention already!
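To make the "positively constrained to unconstrained" reason concrete, here is a minimal sketch in plain Python. The helper names `to_search` and `to_model` are hypothetical, chosen only for this illustration, and the starting value and step sizes are arbitrary.

```python
import math


def to_search(p):
    """Map a strictly positive model parameter to the unconstrained search space."""
    return math.log(p)


def to_model(q):
    """Map any real search-space value back to a strictly positive parameter."""
    return math.exp(q)


# Arbitrary steps an optimiser or sampler might propose.
steps = (-20.0, -1.0, 0.0, 1.0, 20.0)

# Stepping in log space: every proposal maps back to a valid (positive) parameter.
q_start = to_search(1e-3)
stepped = [to_model(q_start + dq) for dq in steps]
all_positive = all(p > 0 for p in stepped)

# Stepping directly on the parameter: some proposals leave the valid region.
naive = [1e-3 + dp for dp in steps]
any_invalid = any(p <= 0 for p in naive)
```

In the transformed space there is no boundary to run into, so no proposal needs to be rejected or clipped for violating positivity.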

MichaelClerx commented 4 years ago

Tell me more! I don't usually think about optimisation from a stats point of view. The first point makes the search space bigger, and for the other two I don't know why they would help?

MichaelClerx commented 4 years ago

So imagine I'm doing a gradient-descent algorithm, or a simplex method. Where is the "distribution" in this problem and why do I care about its skewness?

Similar scales for each param and each derivative help some methods (i.e. ones that don't already account for that), but what do you mean by similar variance? And does the log transform do this just because it flattens everything?

That's what I was thinking of saying: that repeated log transforms make the function flatter / more boring. (I think we need to talk about repeated application here, because we don't know what the "true" space looks like: there might already have been a log transform, and we'll need to show that it's still useful to apply one again!)
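One way to see the "similar scales for each param and each derivative" point is a toy error function whose two parameters differ by six orders of magnitude. This is only a sketch under stated assumptions: the values `A0` and `B0` and the error function `f` are made up for illustration and do not come from any model discussed here.

```python
# Hypothetical "true" parameter values, six orders of magnitude apart.
A0, B0 = 1e-4, 1e2


def f(a, b):
    """Toy error: squared relative deviation of each parameter."""
    return ((a - A0) / A0) ** 2 + ((b - B0) / B0) ** 2


def grad_untransformed(a, b):
    """Gradient of f with respect to (a, b)."""
    return (2 * (a - A0) / A0 ** 2, 2 * (b - B0) / B0 ** 2)


def grad_log(a, b):
    """Gradient of f with respect to (log a, log b), via the chain rule:
    d f / d log(a) = a * d f / d a."""
    ga, gb = grad_untransformed(a, b)
    return (a * ga, b * gb)


# Evaluate at a point that is "equally wrong" (a factor of 2) in both parameters.
a, b = 2 * A0, 2 * B0
ga, gb = grad_untransformed(a, b)
ua, ub = grad_log(a, b)

ratio_untransformed = ga / gb  # derivatives differ by about a factor 1e6
ratio_log = ua / ub            # derivatives are about equal
```

For this toy function, a method that takes the same step size in every direction sees wildly different slopes in the untransformed space and comparable slopes in log space.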

chonlei commented 4 years ago

The first point makes the space 'smoother': you won't hit that 'big big barrier' to start with (certainly better for methods like MCMC; maybe it is not as attractive for an optimisation problem).

And for the second point, which is more important, imagine one function was a single sawtooth (very skewed) and the other was a lovely ideal quadratic function (zero skewness) that everyone likes. I suppose even a gradient-descent algorithm likes the latter -- it would not be able to search at all if it started on the flat side of the sawtooth! (Well, I said "distribution", but it is the same for any "function" that you want to optimise.)
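A rough sketch of the "flat side" problem, assuming a made-up error function that is quadratic in log(p) and therefore very lopsided in p itself. The optimum `P0`, the step size, and the starting point are arbitrary choices for illustration; plain fixed-step gradient descent is used only because it is the simplest method to write down.

```python
import math

P0 = 1.0             # hypothetical optimum
LEARNING_RATE = 0.25  # same step size in both spaces
N_STEPS = 100


def error(p):
    """Toy error: quadratic in log(p), so flat for p much larger than P0."""
    return math.log(p / P0) ** 2


def grad_p(p):
    """Derivative of the error with respect to p."""
    return 2.0 * math.log(p / P0) / p


def grad_q(q):
    """Derivative of the error with respect to q = log(p)."""
    return 2.0 * (q - math.log(P0))


# Gradient descent directly on p, starting far out on the flat side.
p_direct = 1000.0
for _ in range(N_STEPS):
    p_direct -= LEARNING_RATE * grad_p(p_direct)

# The same number of steps on q = log(p), from the same starting point.
q = math.log(1000.0)
for _ in range(N_STEPS):
    q -= LEARNING_RATE * grad_q(q)
p_log = math.exp(q)
```

With these settings `p_direct` has barely moved from its starting value, while `p_log` ends up very close to `P0`. The curvature at the optimum is the same in both spaces here, so the difference comes from the shape away from the optimum rather than from the step size.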

Yes, if you prefer the term 'scale'.

"And does the log transform do this just because it flattens everything?"

And the more boring the function, the better! (Why would you even want excitement in an optimisation?) More boring -> 'smoother', and anything smooth is good here. :-D

Obviously, after all this, I am not (and we should not be) saying to just apply a log-transformation to any parameter for any problem that one has! Never do that... but as you said, when it's needed, and in some cases (like for the points above), consider a transformation. It need not be a log-transformation, but sometimes a log-transformation can be helpful! I guess that's the message?

chonlei commented 4 years ago

In terms of the second point, I think a good example of that is Figure 8 in the "Calibration of ionic and cellular cardiac electrophysiology models" WIREs review, no?! And this example notebook shows the same point/issue.

MichaelClerx commented 4 years ago

I'm not sure I'd call that skewness? It's not like it was originally a normal distribution that we then skewed somehow?

MichaelClerx commented 4 years ago

I'm happy with what we have currently (it doesn't say we don't know why). Might add more when/if we do an AP example, and/or when the new pints transformations are in!

CardiacModelling / fitting-notebooks

Reasons for log-transformation #9