Closed: bmcclintock closed this pull request 8 years ago
I think this might have to do with a common choice Devin makes for the priors on Beta; namely, he sets them to
Beta ~ MVN(0, \tau (X'X)^{-1})
to scale the prior to the relative magnitude of the design matrix and improve mixing. I think this is something Jennifer Hoeting has advocated, if I remember Devin's explanation correctly. It's also a strategy I've adopted, so I'd be interested to hear what issues you've had with it!
-P
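For concreteness, here is a minimal R sketch of the scaled prior as written above. The design matrix X and the value of \tau are purely illustrative, and this is not the probitCJS internals; in particular, whether \tau enters there as a variance or a precision multiplier is not settled here.

# Hedged illustration only: one draw from Beta ~ MVN(0, tau * (X'X)^{-1})
# using a hypothetical time design matrix.
set.seed(1)
X     <- model.matrix(~ factor(rep(1:6, each = 20)))  # illustrative design, 120 x 6
tau   <- 0.01
Sig0  <- tau * solve(crossprod(X))                    # prior covariance scaled by X'X
L     <- t(chol(Sig0))                                # lower-triangular factor, L %*% t(L) = Sig0
beta0 <- drop(L %*% rnorm(ncol(X)))                   # one draw from the prior

The point of the scaling is that the prior spread on each coefficient adapts to the scale (and sample size) of the design matrix rather than being fixed on the probit scale.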
Have you tried this approach with \tau > 0.01? I've been running probitCJS() with \tau = 1 (or 0.1), which seem like perfectly reasonable, barely informative priors for the probit link. With simulated data (\phi ~ 0.95, p ~ 0.4), I get horrifically lousy estimates.
Here it is for the good ol' dipper data:
data(dipper)
Default prior distributions (\tau = 0.01):
fit1 = crm(dipper, model = "probitCJS",
           model.parameters = list(Phi = list(formula = ~time), p = list(formula = ~time)),
           burnin = 5000, iter = 50000)
fit1$results$reals
$Phi
  time occ      mode      mean        sd  CI.lower  CI.upper
1    6   6 0.5802428 0.6157544 0.1067098 0.4318847 0.8961070
2    5   5 0.5983627 0.6113757 0.1072963 0.4048039 0.9193108
3    4   4 0.6105205 0.6246322 0.1225496 0.4251695 0.9575122
4    3   3 0.4691240 0.5610643 0.1787757 0.3553289 0.9784923
5    2   2 0.4740408 0.6467389 0.2031592 0.3841401 0.9999156
6    1   1 0.9954080 0.8256072 0.1657278 0.5282918 1.0000000

$p
  time occ      mode      mean        sd  CI.lower  CI.upper
1    7   7 0.9830198 0.8259658 0.1661635 0.3839609 0.9999998
2    6   6 0.9378322 0.8527325 0.1708688 0.3561641 0.9971572
3    5   5 0.9214020 0.8026059 0.1934767 0.3238298 0.9821723
4    4   4 0.9126367 0.7238430 0.2315205 0.2885504 0.9832404
5    3   3 0.8958076 0.6543822 0.2132083 0.3325620 0.9910540
6    2   2 0.5698856 0.6154969 0.1569900 0.3370045 0.9175915
Changing prior distributions to \tau = 1:
fit2 = crm(dipper, model = "probitCJS",
           model.parameters = list(Phi = list(formula = ~time, prior = list(mu = rep(0, 6), tau = 1)),
                                   p   = list(formula = ~time, prior = list(mu = rep(0, 6), tau = 1))),
           burnin = 5000, iter = 50000)
fit2$results$reals
$Phi
  time occ      mode      mean         sd  CI.lower  CI.upper
1    6   6 0.5653200 0.5638793 0.03237527 0.5006367 0.6271644
2    5   5 0.5581776 0.5599721 0.03350454 0.4946596 0.6258098
3    4   4 0.5543651 0.5534975 0.03469824 0.4839541 0.6193838
4    3   3 0.5129871 0.5143534 0.03648440 0.4418147 0.5840749
5    2   2 0.5135764 0.5177559 0.04252288 0.4340589 0.5998533
6    1   1 0.5762414 0.5766708 0.06972554 0.4398088 0.7116584

$p
  time occ      mode      mean         sd  CI.lower  CI.upper
1    7   7 0.5869295 0.5912162 0.03911119 0.5135447 0.6670067
2    6   6 0.6131401 0.6162346 0.04041677 0.5362677 0.6940604
3    5   5 0.6119031 0.6182684 0.04223474 0.5341609 0.6992560
4    4   4 0.5974503 0.5972898 0.04822438 0.5052169 0.6946310
5    3   3 0.5960049 0.5939367 0.05625055 0.4836240 0.7040048
6    2   2 0.5867878 0.5843892 0.08063812 0.4282024 0.7412024
p is supposed to be around 0.9 for these data. Scary! Or maybe I'm missing something here?
Brett
I see that \tau=1 makes for a very informative prior using the \tau (X'X)^{-1} approach, but because X'X for this model depends on the latent variables (z) at each iteration, it seems it will be pretty tough for folks to specify priors for Beta.
Brett
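To make that point concrete, here is a toy illustration (the design and the "active row" indicators below are purely illustrative, not the marked code): the rows of the design matrix that enter the Beta update depend on the current latent states z, so X'X, and hence any prior defined through (X'X)^{-1}, changes from iteration to iteration.

# Toy illustration: X'X under two different configurations of the latent states
X_full <- model.matrix(~ factor(rep(1:3, each = 4)))   # 12 x 3 illustrative design
alive1 <- rep(c(TRUE, FALSE), 6)                        # rows "active" under one draw of z
alive2 <- rep(c(TRUE, TRUE, TRUE, FALSE), 3)            # rows "active" under another draw of z
crossprod(X_full[alive1, , drop = FALSE])               # X'X under the first z
crossprod(X_full[alive2, , drop = FALSE])               # a different X'X under the second z
# ...so a prior with covariance tau * solve(X'X) is itself changing across iterations.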
Yeah, I'm not sure the scaling makes as much sense in latent-variable applications where the design matrix potentially changes each iteration. I haven't looked at the code, but this also means that updates for the latent variables would require evaluating the prior distribution for Beta in the full conditional...
Maybe in this case it makes the most sense to put in a constant \tau (or perhaps allow it to be entered as a vector if there are multiple regression parameters).
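For reference, a hedged sketch (not marked's actual code) of what a conjugate Gibbs update for Beta could look like with a fixed prior Beta ~ MVN(mu0, T0^{-1}), where T0 = diag(tau0) is a constant scalar or vector precision as suggested above. Here X stands for the design-matrix rows currently in the likelihood and ytilde for the corresponding truncated-normal latent variables; both names are assumptions for illustration.

draw_beta <- function(X, ytilde, mu0, tau0) {
  # Standard Albert-and-Chib-style conjugate update under a fixed normal prior:
  # Beta | ytilde ~ MVN(V (X'ytilde + T0 mu0), V), with V = (X'X + T0)^{-1}.
  T0 <- diag(tau0, ncol(X))                         # constant prior precision (scalar or vector tau0)
  V  <- solve(crossprod(X) + T0)                    # full-conditional covariance
  m  <- V %*% (crossprod(X, ytilde) + T0 %*% mu0)   # full-conditional mean
  drop(m + t(chol(V)) %*% rnorm(ncol(X)))           # one MVN(m, V) draw
}

With a fixed T0 the prior no longer tracks how X'X changes with z, which is the trade-off being discussed here.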
Not sure if this will go through from my phone to everyone, but let's have a chat when I get back. I'm about to write up a Bayesian multistrata paper and we might brainstorm this out.
Original pull request description (bmcclintock):
I noticed this after playing around with the beta prior distributions and finding extreme prior sensitivity (the bug is barely discernible using the default priors). I think the Gibbs updates are now correct, but Devin should probably take a look and confirm.