NathanielF commented 1 year ago

Notebook proposal

Bayesian Methods for Reliability Data

This topic crops up in engineering quite a bit and generally involves a spin on survival analysis (time-to-failure) models, but it's also tightly linked with the notion of cost-benefit analysis, since there is usually a tolerable degree of failure. I was thinking about riffing on the discussion of prediction intervals for failure analysis in:

https://www.wiley.com/en-us/Statistical+Methods+for+Reliability+Data,+2nd+Edition-p-9781118115459

and contrasting the frequentist and Bayesian approaches to calibrated prediction. It might be interesting to contrast conformal prediction methods too, since they are a frequentist approach to uncertainty quantification that is getting quite a bit of coverage lately.

Suggested categories:

Level: Intermediate
Diataxis type: explanation

Related notebooks

It might have some crossover in themes with https://github.com/pymc-devs/pymc-examples/issues/407, and some of the techniques for modelling failure predictions draw on survival analysis models.

OriolAbril commented 1 year ago

I think it would be a great addition, and it is a great fit for an explanation type notebook (that can/could complement other how-to notebooks on generating predictions for different kinds of models, for example).

NathanielF commented 1 year ago

Did a little work on this over the Christmas holidays. I've written up a brief discussion of nonparametric and parametric estimation of reliability distributions and their CDFs. I'm following an example in the book cited above, which moves through frequentist MLE-style estimation to a data set with very few failures. It argues for the importance of Bayesian modelling in this case especially, so I think it's a good example for here.
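For anyone following along, the nonparametric side of this can be sketched with a Kaplan-Meier style estimate of the failure CDF under right censoring. This is only a minimal illustration and not the notebook's code: the function name `kaplan_meier_cdf` and the toy data are assumptions made up for the example, and units censored at a failure time are treated as still at risk at that time.

```python
def kaplan_meier_cdf(times, failed):
    """Estimate the failure CDF F(t) at each distinct failure time.

    times  : observed time for each unit (failure or censoring time)
    failed : True if the unit failed at that time, False if right-censored
    """
    data = sorted(zip(times, failed))
    n_at_risk = len(data)
    survival = 1.0
    estimate = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_failures = 0
        n_leaving = 0
        # Gather every unit (failed or censored) observed at time t.
        while i < len(data) and data[i][0] == t:
            n_failures += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_failures > 0:
            # Product-limit update: S(t) = S(t-) * (1 - d / n).
            survival *= 1.0 - n_failures / n_at_risk
            estimate.append((t, 1.0 - survival))
        n_at_risk -= n_leaving
    return estimate


# Hypothetical toy data: five units, two of them right-censored.
toy_times = [2, 3, 3, 5, 8]
toy_failed = [True, True, False, True, False]
toy_cdf = kaplan_meier_cdf(toy_times, toy_failed)
```

With very few failures the step function has only a handful of jumps, which is part of why the parametric and Bayesian fits become attractive for this kind of data.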

However, I'm a bit stuck trying to replicate the model fits achieved in the book, or even to bring them anywhere close to the MLE fits. I'm using a Weibull survival model and I've tried (a) replicating the Stan model they use in the book, (b) using a base PyMC Weibull fit and adding a potential for the lcdf portion of the log likelihood, and (c) using the censored model transformation as discussed in one of the survival analysis notebooks by @drbenvincent. I've used both period-form and item-period data sets as described in the notebook, but none seem to recover good model fits with the PyMC implementation.
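To make the target of these attempts concrete, here is a minimal pure-Python sketch of a right-censored Weibull log likelihood, under the assumption that observed failures contribute the log density and censored units contribute the log survival function (the log complementary CDF). It is not the book's Stan model or the notebook's PyMC model; the function names and toy numbers are hypothetical, and the shape/scale naming (`alpha`, `beta`) is chosen only to mirror the usual Weibull parametrisation.

```python
import math


def weibull_logpdf(t, alpha, beta):
    # log f(t) for shape alpha and scale beta:
    # f(t) = (alpha / beta) * (t / beta)**(alpha - 1) * exp(-(t / beta)**alpha)
    return (
        math.log(alpha / beta)
        + (alpha - 1.0) * math.log(t / beta)
        - (t / beta) ** alpha
    )


def weibull_logsf(t, alpha, beta):
    # log S(t) = log(1 - F(t)) = -(t / beta)**alpha
    return -((t / beta) ** alpha)


def censored_weibull_loglik(times, failed, alpha, beta):
    """Sum log f(t) over failures and log S(t) over right-censored units."""
    total = 0.0
    for t, has_failed in zip(times, failed):
        if has_failed:
            total += weibull_logpdf(t, alpha, beta)
        else:
            total += weibull_logsf(t, alpha, beta)
    return total


# Hypothetical check: with alpha = 1 the Weibull reduces to an exponential
# with mean beta, so the terms are easy to verify by hand.
toy_loglik = censored_weibull_loglik([1.0, 4.0], [True, False], alpha=1.0, beta=2.0)
```

Comparing a sampler's posterior mode against the maximiser of a function like this is one way to tell whether a poor fit comes from the likelihood specification or from the priors.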

I reckon I must be doing something silly so would appreciate any pointers you might have.

NathanielF commented 1 year ago

Think the above discussed issue is resolved now:

Now I just need to write up the contrasting perspectives on prediction intervals.

NathanielF commented 1 year ago

Ok, I've been banging my head against pre-commit issues on this pull request for a while now. I don't really understand it either, since the tests appear to be passing for me locally:

But when they run in the cloud it breaks on the jupytext test where it says the myst notebook git index is out of date...

Any idea why this might be happening, @OriolAbril?

NathanielF commented 1 year ago

Some more context on the error. It crashes out with the message here:

But should the paired notebook not appear in the myst nb directory rather than the examples/case_studies directory?

NathanielF commented 1 year ago

I suspect the issue has something to do with the metadata, as discussed here: https://github.com/mwouts/jupytext/issues/900

My version of jupytext:

I couldn't easily find which version of jupytext the runner is using.

NathanielF commented 1 year ago

Ok, fixed the git index issue. I finally realised there is no longer a myst notebooks directory on main. But now the docs won't build because the build is failing on the watermark extension:

NathanielF commented 1 year ago

I think some of these issues are due to the fact that the HEAD of the main branch I'm working from does not reflect the actual latest commit, where @drbenvincent added the template notebook and the myst directory was removed.

I mean, I have no idea why the watermark extension is failing in sphinx, but there is a break in the directory structure, and the jupytext configuration being read is not the one from the most up-to-date branch.

NathanielF commented 1 year ago

Ok, to summarise my findings so far:

When I had the old configuration, the sphinx docs would build successfully, but the pre-commit checks would fail because jupytext expected a notebook in the examples/ directory.

The pre-commit checks fail unless there is a myst notebook in the examples/ folder. The old jupytext specification added the myst notebook to the myst directory. Changing the jupytext configuration to the one visible on GitHub seems to solve the pre-commit issue and populates the notebook in the examples directory. However, the sphinx doc build then started to fail with the %load extension error. I don't understand why this would be the case, but I assume it has something to do with the mismatch between commits.

I'm going to try branching off the latest commit and starting a new pull request.

NathanielF commented 1 year ago

Branching off the latest commit seems to have resolved the pre-commit issues!

OriolAbril commented 1 year ago

Sorry for not seeing this before. Yeah, we recently changed the way pre-commit and jupytext work because it was often a source of conflicts. It should now be simpler in that running pre-commit will always fix the jupytext step (like black, for example), but you will have needed to rebase and get rid of the myst_nbs notebook and of the doc build folder (if you built the docs locally).

NathanielF commented 1 year ago

All sorted now. Thanks!

NathanielF commented 1 year ago

New pull request with passing commit checks is ready for review here: https://github.com/pymc-devs/pymc-examples/pull/491

pymc-devs / pymc-examples

Bayesian Methods for Reliability Data #474