Ironhack-data-bcn-oct-2023 / project-IV-sql-tableau


[Noe] - Project-IV #4

Open niniet98 opened 1 year ago

niniet98 commented 1 year ago

https://github.com/niniet98/PROJECT-IV

sahernandezr commented 11 months ago

🐬📊Congrats on your project!

README Your README is well organized. I would suggest adding a bit more descriptive information to your Data Source section. You start the Hypothesis section by talking about differences among states, but at that point I still don't know which country you are dealing with. You could add something like: "Data Science Job Posting on Glassdoor, sourced from Kaggle, which includes information from over 600 job postings from more than 400 companies in the United States."

The way you present your Analysis is clear and engaging. Including the visualizations right alongside your analysis is a great idea.

Code You could improve your .py file by adding a docstring to the clean function, just as you did with the standarize_w_nulls function.
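As a rough sketch of what such a docstring could look like: the signature, the input type, and the cleaning steps below are assumptions for illustration only, since your actual clean function will do something different.

```python
def clean(rows):
    """Return a cleaned copy of the job-posting records.

    Hypothetical example: the real cleaning steps in the project may differ.

    Args:
        rows: list of dicts, one per job posting.

    Returns:
        A new list of dicts with surrounding whitespace stripped from
        string values and exact duplicate records removed. The input
        list is not modified.
    """
    cleaned = []
    seen = set()
    for row in rows:
        # Strip whitespace from string values only; leave other types alone.
        tidy = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        # A sorted tuple of items is a hashable fingerprint of the record.
        key = tuple(sorted(tidy.items()))
        if key not in seen:
            seen.add(key)
            cleaned.append(tidy)
    return cleaned
```

The point is that someone reading `help(clean)` can tell what goes in, what comes out, and whether the input is modified, without opening the source.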

SQL You created great queries to answer your questions, and there is a lot of opportunity to extend this analysis: for example, including the industry in your queries to see how needs differ across industries, or adding cost-of-living data for each state to see whether the salaries are up to par.
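To illustrate the industry idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name `jobs`, the columns `state`, `industry` and `avg_salary`, and the four rows are all invented for the example; your schema and data will differ.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up toy rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (state TEXT, industry TEXT, avg_salary REAL)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        ("CA", "Tech", 150.0),
        ("CA", "Tech", 130.0),
        ("CA", "Health", 110.0),
        ("NY", "Finance", 140.0),
    ],
)

# Adding industry to the GROUP BY splits each state's figures by industry.
query = """
    SELECT state, industry, COUNT(*) AS postings, AVG(avg_salary) AS mean_salary
    FROM jobs
    GROUP BY state, industry
    ORDER BY state, mean_salary DESC
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
conn.close()
```

A cost-of-living comparison would follow the same pattern, with a JOIN on state against a second table holding that index.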

Tableau I am not clear on what you were trying to show with the Percentage difference between the average wage and the maximum wage by state above 100%. The average is pulled up by the high values at the top, so it may be better to use the median of each state instead of the average. It is also very important to be clear about the units of measure you are using in your visualization. Is this yearly salary? Measured in thousands of US dollars? Your graphs should be completely clear on their own.
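A small sketch of why the choice between average and median matters here. The salary figures are invented, and the unit (thousands of US dollars per year) is an assumption made explicit in the variable name.

```python
from statistics import mean, median

# Invented salaries for one state, in thousands of USD per year.
salaries_k_usd = [90, 95, 100, 105, 310]

avg = mean(salaries_k_usd)    # pulled upward by the single 310 posting
med = median(salaries_k_usd)  # unaffected by how large the top value is
top = max(salaries_k_usd)


def pct_above(reference, value):
    """Percentage by which value exceeds reference."""
    return (value - reference) / reference * 100


print(f"mean={avg}, median={med}, max={top}")
print(f"max vs mean:   {pct_above(avg, top):.1f}%")
print(f"max vs median: {pct_above(med, top):.1f}%")
```

With one very high posting, the mean moves toward the maximum, so the gap measured against the mean understates how far the top salary sits from a typical one.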

In your second story point, Skills impact on Salary, you are showing how many job postings for each type of job ask for the different skills, but that is not enough information to talk about each skill's impact on salary. What you can say is that almost all data scientist or data engineer postings require Python, but if you are looking for a director job, technical skills like Excel, Python or Tableau are not highlighted as relevant. And if you are thinking of being a manager, you may need to dust off your Tableau skills, because that type of job is the one that mentions this skill the most, followed by the job postings looking for analysts.
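What the chart measures can be sketched as a share of postings per job type that mention a skill. The records and field names below (`job_type`, `skills`) are invented for illustration; note that nothing in the calculation involves salary, which is why it describes demand for a skill and not its effect on pay.

```python
from collections import defaultdict

# Invented postings; field names are hypothetical.
postings = [
    {"job_type": "data scientist", "skills": {"python", "sql"}},
    {"job_type": "data scientist", "skills": {"python"}},
    {"job_type": "manager", "skills": {"tableau"}},
    {"job_type": "manager", "skills": set()},
]


def skill_share(postings, skill):
    """Fraction of postings per job type that mention the given skill."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for posting in postings:
        totals[posting["job_type"]] += 1
        if skill in posting["skills"]:
            hits[posting["job_type"]] += 1
    return {job: hits[job] / totals[job] for job in totals}


print(skill_share(postings, "python"))
print(skill_share(postings, "tableau"))
```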

In your third story point, the treemap is not the best visualization to show this info. You have too many industries, so the same colors are reused several times and several rectangles are almost the same size. In this case, a bar plot could show the average wage and allow for an easier comparison. (Tableau publishes some tips for treemaps.)

From what I know of Glassdoor, the rating tells you how each company is graded, so on your bar plot we can see that the lowest-rated companies offer higher average salaries. That could be interesting to look into more deeply: it may be that the lowest-rated companies need to offer higher pay to attract talent in a competitive market of companies looking for data specialists.