almsam / data-analysis-project-remastered

a revision of my old data science project



Data Analysis on the GAYBORHOODS dataset

During my second year of university, I was tasked with completing a data analysis project on a dataset of my choosing. This included data cleaning, exploratory data analysis, data visualization, & presentation. Since then I have added a wide variety of skills to my toolkit as a data scientist, such as constructing advanced linear regression models, classification models, decision trees, & simple neural networks. I have decided to use these tools to revisit this dataset & see if I can find insights I missed the first time.

About The Project

The project is built in Python & focuses on pandas/seaborn for data frames & visualization, with statsmodels & scikit-learn for the ML models, and NumPy & Matplotlib as prerequisites for the others.

Built With: Python, using pandas & seaborn, in addition to statsmodels & scikit-learn for ML

Original project by Sami Almuallim & Nat Scott, built with Python using pandas & seaborn


At a glance

The project has 2 main outcomes: some conclusions, in the form of thesis statements about trends in the data, & some visualizations to show my conclusions & the final iteration of my analysis.

Such as my conclusion from the first section, that queer people's tax rates relative to straight people's are most strongly determined by which city they live in:

Part 1: Tax Analysis
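
One way to check a claim like "city is the strongest determinant" is to measure how much of the variance in a numeric column is explained by a grouping column (eta squared, the one-way ANOVA effect size). The block below is only a minimal sketch with pandas: the column names `city` & `tax_rate_gap` and the toy numbers are made up for illustration, and it is not necessarily the method used in this project.

```python
import pandas as pd


def eta_squared(df: pd.DataFrame, group_col: str, value_col: str) -> float:
    """Share of the variance in value_col explained by group_col (0 to 1)."""
    grand_mean = df[value_col].mean()
    # Total sum of squares: spread of every row around the overall mean.
    ss_total = ((df[value_col] - grand_mean) ** 2).sum()
    groups = df.groupby(group_col)[value_col]
    # Between-group sum of squares: spread of the group means, weighted by group size.
    ss_between = (groups.count() * (groups.mean() - grand_mean) ** 2).sum()
    return float(ss_between / ss_total)


# Toy data (hypothetical, NOT from the GAYBORHOODS dataset).
toy = pd.DataFrame({
    "city": ["A", "A", "B", "B", "C", "C"],
    "tax_rate_gap": [1.0, 1.2, 3.0, 3.2, 5.0, 5.2],
})

score = eta_squared(toy, "city", "tax_rate_gap")
print(f"Share of variance explained by city: {score:.3f}")
```

Running the same function against other candidate grouping columns gives a rough ranking of which one explains the most variance. A value near 1 means the groups account for almost all of the spread.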

Or my conclusion from the second section, that the size of queer communities is also best explained by which city you're in:

Part 2: Bars & Parades

Or my insights into how the political alignment & locale of queer communities differ between the community centers & districts defined by the Kinsey index & those defined by the GAYBORHOODS TOTINDEX:

Part 3: Political alignment

Or my final analysis of each city's geography, specifically the locations of queer communities (or GAYBORHOODS):

Part 4: Geographics not Demographics


Contact

Sami Almuallim - samialmuallim@gmail.com

Project Link: https://github.com/almsam/data-analysis-project-revised
