Hi,
I have one variable (raw) and its rescaled version from 0 to 10 (normalized), obtained through the min-max normalization formula:
normalized_i = (raw_i - min(raw)) / (max(raw) - min(raw))
Since raw and normalized are linear re-parameterizations of each other, I expected the same AIC from fitting a binary phylogenetic logistic regression through phyloglm with each of them as the predictor and presence as the response:
raw <- phyloglm(presence ~ raw, data = git, phy = git.tree, method = "logistic_MPLE", btol = 30)
normalized <- phyloglm(presence ~ normalized, data = git, phy = git.tree, method = "logistic_MPLE", btol = 30)
However, the two models have very different AICs:
raw$aic: 1168.742
normalized$aic: 1112.437
As a comparison, if I run a non-phylogenetic regression with glm, I get the same AIC (1158) for both predictors:
glm(presence ~ raw, data = git, family = "binomial")
glm(presence ~ normalized, data = git, family = "binomial")
This difference in the AIC of the two phyloglm models (raw and normalized) is problematic when I try to compare raw and normalized with some other predictor (let's call it other), since I get the weird situation in which, for example, raw is a better predictor than other, but normalized is worse (when, as far as I can understand, they should have the same performance).
I don't know what I am missing, but if someone wants to explore the data, they are available on my drive at this link: https://drive.google.com/file/d/192lPTVECtIZZHkc7hhwyvXYDBwL5kVvP/view?usp=drive_link
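To show why I expected identical AICs, here is a minimal sketch of the glm case on simulated data. It does not use my actual dataset: the seed, sample size, coefficients, and the names sim, fit_raw, and fit_norm are made up for illustration. The normalization is the formula as written above, which maps onto 0 to 1; multiplying by a constant such as 10 is still a linear rescaling and should not change the argument.

```r
# Simulated data only; these values are illustrative assumptions,
# not the dataset linked in this post.
set.seed(1)
n <- 200
raw <- rnorm(n, mean = 50, sd = 10)
presence <- rbinom(n, size = 1, prob = plogis(-5 + 0.1 * raw))

# Min-max normalization, following the formula given above.
normalized <- (raw - min(raw)) / (max(raw) - min(raw))

sim <- data.frame(presence, raw, normalized)

# Ordinary (non-phylogenetic) logistic regressions on each predictor.
fit_raw  <- glm(presence ~ raw,        data = sim, family = "binomial")
fit_norm <- glm(presence ~ normalized, data = sim, family = "binomial")

# The coefficients differ between the two fits, but the likelihood,
# and therefore the AIC, should agree up to numerical tolerance.
aic_raw  <- AIC(fit_raw)
aic_norm <- AIC(fit_norm)
print(c(raw = aic_raw, normalized = aic_norm))
print(isTRUE(all.equal(aic_raw, aic_norm, tolerance = 1e-6)))
```

This is the behaviour I see with glm on my real data, and the behaviour I expected from phyloglm as well.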
Thank you so much,