jomulder / BFpack

BFpack can be used for testing statistical hypotheses using the Bayes factor, a Bayesian criterion originally developed by sir Harold Jeffreys.
https://bfpack.info/
14 stars 4 forks source link

Issue with constraints "with no common boundary point" #24

Closed tpourmohamad closed 1 month ago

tpourmohamad commented 3 years ago

I'm not sure what to call this issue, but it seems that the code breaks for constraints that are symmetric. For example,

library(BFpack)

set.seed(4)

n <- 100
x <- rnorm(n, 50)
y <- rnorm(n, 53)

z <- c(x,y)
w <- rep(c("x","y"),each = n)

model = lm(z ~ w)

hypothesis <- "-5 < wy & wy < 5"
BF(model, hypothesis = hypothesis, complement = TRUE)

hypothesis <- "-6 < wy & wy < 5"
BF(model, hypothesis = hypothesis, complement = TRUE)

You will see that constraining the slope term between -5 and 5 (i.e., symmetry) breaks, but constraining it between -6 and 5 does not. Am I missing something here?

cjvanlissa commented 3 years ago

@jomulder the hypotheses are parsed correctly, so this bug is in the BFpack estimator

> bain:::parse_hypothesis("wy", hypothesis)
$hyp_mat
$hyp_mat[[1]]
      wy   
-5<wy  1 -5
wy<5  -1 -5

$n_constraints
[1] 0 2

$n_ec
[1] 0

$original_hypothesis
[1] "-5<wy&wy<5"

> 
> hypothesis <- "-6 < wy & wy < 5"
> bain:::parse_hypothesis("wy", hypothesis)
$hyp_mat
$hyp_mat[[1]]
      wy   
-6<wy  1 -6
wy<5  -1 -5

$n_constraints
[1] 0 2

$n_ec
[1] 0

$original_hypothesis
[1] "-6<wy&wy<5"
jomulder commented 3 years ago

Dear tpourmohamad,

Thanks for your message. Indeed the bugs was caused by an issue with symmetrical bounds which is now fixed in the developmental version on github. Thanks for spotting this and reporting it.

Best, Joris

tpourmohamad commented 3 years ago

Hi Joris,

Thank you for being so prompt! The package is awesome and definitely intend to use/cite it.

Best, Tony

On Mon, Mar 1, 2021 at 7:08 AM jomulder notifications@github.com wrote:

Dear tpourmohamad,

Thanks for your message. Indeed the bugs was caused by an issue with symmetrical bounds which is now fixed in the developmental version on github. Thanks for spotting this and reporting it.

Best, Joris

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jomulder/BFpack/issues/24#issuecomment-788020479, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALP4NV4RNY7ZX2MF7DER3VLTBOUVTANCNFSM4YDQPEIQ .

jomulder commented 3 years ago

Hi Tony,

Good to hear that you like it! And thanks again for spotting and reporting these bugs. If you spot more please let me know of course.

Thanks, Joris

tpourmohamad commented 3 years ago

Hi Joris and Caspar,

I had a follow-up question that is more fundamental in nature. I am fitting a linear regression model where there are four parameters to be estimated. I want to test two constrained hypotheses, of which one is a special case of the other, i.e.,

BF_freq <- BF(m_freq, hypothesis = "(grps3, grps4) > grps2 > grps1; grps4 > grps3 > grps2 > grps1; grps1 ,grps2, grps3, grps4", complement = FALSE) summary(BF_freq)

When running this code, I get the following output:

Bayesian hypothesis test Type: confirmatory Object: lm Parameter: regression coefficients Method: generalized adjusted fractional Bayes factors

Posterior probabilities: Pr(hypothesis|data) H1 0.332 H2 0.639 H3 0.028

Now, my fundamental question is whether or not Pr(H1 | Data) excludes the case of grps4 > grps3 > grps2 > grps1 (since that is H2) or does some bit of that constraint also contribute to Pr(H1 | Data)? As in, if the true relationship is grps4 > grps3 > grps2 > grps1, then shouldn't Pr(H1 | Data) be as likely as Pr(H2|Data) since H2 is a special case of H1? I assume hypotheses do not need to be mutually exclusive of one another, but then those probabilities are a bit confusing for me to interpret. Let me know if my question is unclear or if further clarification is needed. Thanks!

Best, Tony

On Mon, Mar 1, 2021 at 7:24 AM jomulder notifications@github.com wrote:

Hi Tony,

Good to hear that you like it! And thanks again for spotting and reporting these bugs. If you spot more please let me know of course.

Thanks, Joris

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jomulder/BFpack/issues/24#issuecomment-788032823, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALP4NV4LIDJXXLGQILU6LSLTBOWUVANCNFSM4YDQPEIQ .

tpourmohamad commented 3 years ago

Dear Tony,

Thanks for your email. As a short answer, BFpack does not automatically cut out subregions if they are overlapping between two hypotheses. Only the complement hypothesis covers the remaining parameter space that is not covered by the hypotheses. Personally I would not recommend testing hypotheses that are overlapping unless you have a very good reason based on substantive grounds. For example, one theory may be a special case of another theory and one want to take that explicitly into account in the hypothesis test. The reason that I would not recommend to include nested hypotheses in general because the BF would have an upperbound for the “smaller” hypothesis. As a simple example, if we would test H1: grps1>grps2 versus H2: grps1, grps2 (i.e., the unconstrained model), then, if the evidence points clearly towards H1, the Bayes factor of H1 against H2 would be maximally 2, and the posterior probabilities would be 2/3 and 1/3, respectively. This suggests that the evidence for H1 is not conclusive even though the data clearly points towards H1, which is not a good property in my opinion. Instead I would test in this case H1: grps1>grps3 against H2: complement. Then the BF of H1 against its complement would be extremely large and the post. probs. will be 1 and 0, respectively, which is much more insightful about the true state of the the parameters.

Because the nesting also occurs in your setting a similar behavior will be observed, which can also be seen in the results. In your application, H2 is nested in H1, and H1 is nested in H3. Based on the above rational, I would instead test

hypothesis = “grps3 > gprs4 > grps2 > grps1; grps4 > gprs3 > grps2 > grps1“

and I would include the complement and remove the unconstrained model. I expect that the BFs and posterior probabilities then better reflect what is going on, namely that there is a lot of evidence for grps4 > gprs3 > grps2 > grps1.

I hope this helps.

Best, Joris

hypothesis = "(grps3, grps4) > grps2 > grps1; grps4 > grps3 > grps2 > grps1;

On 9 Mar 2021, at 05:13, Tony Pourmohamad tpourmohamad@gmail.com wrote:

 Hi Joris and Caspar,

I had a follow-up question that is more fundamental in nature. I am fitting a linear regression model where there are four parameters to be estimated. I want to test two constrained hypotheses, of which one is a special case of the other, i.e.,

BF_freq <- BF(m_freq, hypothesis = "(grps3, grps4) > grps2 > grps1; grps4 > grps3 > grps2 > grps1; grps1 ,grps2, grps3, grps4", complement = FALSE) summary(BF_freq)

When running this code, I get the following output:

Bayesian hypothesis test Type: confirmatory Object: lm Parameter: regression coefficients Method: generalized adjusted fractional Bayes factors

Posterior probabilities: Pr(hypothesis|data) H1 0.332 H2 0.639 H3 0.028

Now, my fundamental question is whether or not Pr(H1 | Data) excludes the case of grps4 > grps3 > grps2 > grps1 (since that is H2) or does some bit of that constraint also contribute to Pr(H1 | Data)? As in, if the true relationship is grps4 > grps3 > grps2 > grps1, then shouldn't Pr(H1 | Data) be as likely as Pr(H2|Data) since H2 is a special case of H1? I assume hypotheses do not need to be mutually exclusive of one another, but then those probabilities are a bit confusing for me to interpret. Let me know if my question is unclear or if further clarification is needed. Thanks!

Best, Tony

On Mon, Mar 1, 2021 at 7:24 AM jomulder notifications@github.com<mailto:notifications@github.com> wrote:

Hi Tony,

Good to hear that you like it! And thanks again for spotting and reporting these bugs. If you spot more please let me know of course.

Thanks, Joris

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/jomulder/BFpack/issues/24#issuecomment-788032823, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ALP4NV4LIDJXXLGQILU6LSLTBOWUVANCNFSM4YDQPEIQ.

tpourmohamad commented 3 years ago

Hi Joris,

Thank you for the excellent explanation. It is exactly what I thought would happen, so thank you for confirming my beliefs.

Best, Tony

On Tue, Mar 9, 2021 at 12:49 AM J. Mulder J.Mulder3@tilburguniversity.edu wrote:

Dear Tony,

Thanks for your email. As a short answer, BFpack does not automatically cut out subregions if they are overlapping between two hypotheses. Only the complement hypothesis covers the remaining parameter space that is not covered by the hypotheses. Personally I would not recommend testing hypotheses that are overlapping unless you have a very good reason based on substantive grounds. For example, one theory may be a special case of another theory and one want to take that explicitly into account in the hypothesis test. The reason that I would not recommend to include nested hypotheses in general because the BF would have an upperbound for the “smaller” hypothesis. As a simple example, if we would test H1: grps1>grps2 versus H2: grps1, grps2 (i.e., the unconstrained model), then, if the evidence points clearly towards H1, the Bayes factor of H1 against H2 would be maximally 2, and the posterior probabilities would be 2/3 and 1/3, respectively. This suggests that the evidence for H1 is not conclusive even though the data clearly points towards H1, which is not a good property in my opinion. Instead I would test in this case H1: grps1>grps3 against H2: complement. Then the BF of H1 against its complement would be extremely large and the post. probs. will be 1 and 0, respectively, which is much more insightful about the true state of the the parameters.

Because the nesting also occurs in your setting a similar behavior will be observed, which can also be seen in the results. In your application, H2 is nested in H1, and H1 is nested in H3. Based on the above rational, I would instead test

hypothesis = “grps3 > gprs4 > grps2 > grps1; grps4 > gprs3 > grps2 > grps1“

and I would include the complement and remove the unconstrained model. I expect that the BFs and posterior probabilities then better reflect what is going on, namely that there is a lot of evidence for grps4 > gprs3 > grps2 > grps1.

I hope this helps.

Best, Joris

hypothesis = "(grps3, grps4) > grps2 > grps1; grps4 > grps3 > grps2 > grps1;

On 9 Mar 2021, at 05:13, Tony Pourmohamad tpourmohamad@gmail.com wrote:

 Hi Joris and Caspar,

I had a follow-up question that is more fundamental in nature. I am fitting a linear regression model where there are four parameters to be estimated. I want to test two constrained hypotheses, of which one is a special case of the other, i.e.,

BF_freq <- BF(m_freq, hypothesis = "(grps3, grps4) > grps2 > grps1; grps4 > grps3

grps2 > grps1; grps1 ,grps2, grps3, grps4", complement = FALSE) summary(BF_freq)

When running this code, I get the following output:

Bayesian hypothesis test Type: confirmatory Object: lm Parameter: regression coefficients Method: generalized adjusted fractional Bayes factors

Posterior probabilities: Pr(hypothesis|data) H1 0.332 H2 0.639 H3 0.028

Now, my fundamental question is whether or not Pr(H1 | Data) excludes the case of grps4 > grps3 > grps2 > grps1 (since that is H2) or does some bit of that constraint also contribute to Pr(H1 | Data)? As in, if the true relationship is grps4 > grps3 > grps2 > grps1, then shouldn't Pr(H1 | Data) be as likely as Pr(H2|Data) since H2 is a special case of H1? I assume hypotheses do not need to be mutually exclusive of one another, but then those probabilities are a bit confusing for me to interpret. Let me know if my question is unclear or if further clarification is needed. Thanks!

Best, Tony

On Mon, Mar 1, 2021 at 7:24 AM jomulder notifications@github.com wrote:

Hi Tony,

Good to hear that you like it! And thanks again for spotting and reporting these bugs. If you spot more please let me know of course.

Thanks, Joris

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/jomulder/BFpack/issues/24#issuecomment-788032823, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALP4NV4LIDJXXLGQILU6LSLTBOWUVANCNFSM4YDQPEIQ .