todogroup / osposurvey

Open Source Programs (OSPO) Survey
https://todogroup.org
Creative Commons Attribution Share Alike 4.0 International

Add start of data analysis in jupyter #93

Closed: marwahaha closed this 2 years ago

marwahaha commented 3 years ago

I attended a Linux Foundation conference where this data was presented. One of the presenters suggested that a Jupyter notebook might be a nice way to explore the data. Here's a start -- what do you think?

https://nbviewer.jupyter.org/github/marwahaha/osposurvey/blob/jupyter-analysis/2021/data_analysis.ipynb

LawrenceHecht commented 2 years ago

I am a Jupyter novice and can't follow all the code you wrote. Copy-pasting is easy for me, but manipulating things is very slow going ;)

The sample size on one of your charts didn't match the PDF because most of the report is filtered on Sample Qualification = Multiple_employees.
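
For anyone following along, here is a rough sketch of that filter in plain Python. Only the column name `Sample Qualification` and the value `Multiple_employees` come from this thread. The sample rows, the `Respondent` column, the `Single_employee` value, and the helper names `load_rows` and `filter_qualified` are made up for illustration, so the real export may look different.

```python
import csv
import io

# Hypothetical rows for illustration. Only the "Sample Qualification" column
# and the "Multiple_employees" value are taken from the discussion; the
# "Respondent" column and "Single_employee" value are assumptions.
RAW = """Respondent,Sample Qualification
1,Multiple_employees
2,Single_employee
3,Multiple_employees
4,Multiple_employees
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_qualified(rows, column="Sample Qualification",
                     value="Multiple_employees"):
    """Keep only the rows whose qualification column equals the given value."""
    return [row for row in rows if row.get(column) == value]


rows = load_rows(RAW)
qualified = filter_qualified(rows)

# Comparing the two counts shows why a chart built on all rows would report
# a larger sample size than one built on the filtered rows.
print(f"all responses: {len(rows)}, qualified: {len(qualified)}")
```

Printing both counts side by side is a quick way to check whether a chart was built from the full sample or from the filtered one.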

marwahaha commented 2 years ago

@LawrenceHecht - if you like, I can help you walk through the code. Book some time online with me here: https://calendly.com/marwahaha/office-hours

I also added some comments to help you, and filtered the report based on 2+ employees.

LawrenceHecht commented 2 years ago

Awesome start. I scheduled some time. The ideal situation is that the notebook becomes the place where "production" analysis happens. I understand that it can help with quality control.