best-practice-and-impact / ons-spark

Logistic regression #84

emercado4 closed 1 year ago

emercado4 commented 1 year ago

Pre-merge request checklist (to be completed by the one making the request):

Details of this request:

Adds a new section on logistic regression, with guidance on fitting a model in PySpark and sparklyr, issues to watch out for, and how to access regression coefficients, confidence intervals, etc.
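For reviewers who want a reference point for the confidence interval output, here is a minimal sketch of a Wald-type interval built from a coefficient and its standard error. It assumes a normal approximation; the PR does not state which method the new section uses. The function name `wald_interval` and the numbers are hypothetical and are not results from the book.

```python
from math import exp
from statistics import NormalDist


def wald_interval(coefficient, std_error, level=0.95):
    """Return (lower, upper) Wald confidence bounds for one coefficient."""
    # Two-sided critical value of the standard normal, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * std_error
    return coefficient - half_width, coefficient + half_width


# Hypothetical values, for illustration only (not taken from the PR).
coef, se = 0.8, 0.25
lower, upper = wald_interval(coef, se)

# Exponentiating the bounds gives an interval on the odds-ratio scale.
odds_ratio_interval = (exp(lower), exp(upper))
```

In a logistic regression the coefficients are on the log-odds scale, which is why the bounds are exponentiated when an odds ratio is wanted.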

Things to note about this request:

Requirements for review:

ChrisSoderberg-ONS commented 1 year ago

The logistic regression link hasn't been added to the table of contents file

emercado4 commented 1 year ago

> The logistic regression link hasn't been added to the table of contents file

Thanks for spotting - I've now added it to the Spark functions section but it might be moved to a new analysis section later.

ChrisSoderberg-ONS commented 1 year ago

I've checked the code again and confirmed that it works as expected. I've also added a converter script that turns this Markdown file into a Python notebook and an R script so that others can run it, and I have verified that the book builds correctly.
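For anyone curious what such a conversion involves, the sketch below is one possible approach and is not the converter script added in this PR. It assumes the code chunks are fenced blocks tagged `python` or `r`; the function names (`extract_blocks`, `to_script`, `to_notebook`) are hypothetical.

```python
import json
import re

# The fence marker is built indirectly so this example can itself sit in a fence.
TICKS = "`" * 3
FENCE = re.compile(
    rf"^{TICKS}(\w+)[ \t]*\n(.*?)^{TICKS}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_blocks(markdown_text, language):
    """Return the bodies of all fenced blocks tagged with the given language."""
    return [
        body
        for tag, body in FENCE.findall(markdown_text)
        if tag.lower() == language.lower()
    ]


def to_script(markdown_text, language):
    """Join the matching blocks into one script, in document order."""
    return "\n".join(extract_blocks(markdown_text, language))


def to_notebook(markdown_text, language="python"):
    """Build a minimal nbformat 4 notebook (as JSON text), one code cell per block."""
    cells = [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": body.splitlines(keepends=True),
        }
        for body in extract_blocks(markdown_text, language)
    ]
    notebook = {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}
    return json.dumps(notebook, indent=1)
```

Writing the result of `to_notebook(text)` to an `.ipynb` file and `to_script(text, "r")` to an `.R` file would give the two outputs described above. Prose between the code chunks is dropped in this sketch.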

emercado4 commented 1 year ago

Thanks so much for checking all this, Chris! I've also now opened an issue (#104) for adding the cleaned rescue dataset to config.yaml so that the data reading process can be made consistent with the rest of the book.