Open linharesjunior opened 8 months ago
You did great on this project!
Repository: Your repository is well organized, with clearly named files. My only suggestion is to create a notebooks folder for all your Jupyter Notebooks and keep src only for the .py files that contain the functions you write for modularization/encapsulation.
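If you do split things this way, a notebook inside notebooks/ needs a way to find the code in src. Here is a minimal sketch of one common approach, assuming the notebook runs with notebooks/ as its working directory; the helper name `add_project_root` and the module in the final comment are hypothetical, not taken from your repository.

```python
import sys
from pathlib import Path


def add_project_root(notebook_dir: Path) -> Path:
    """Put the parent of the notebooks folder on sys.path and return it."""
    root = notebook_dir.resolve().parent
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root


# Assumption: the notebook's working directory is notebooks/,
# so its parent is the repository root.
project_root = add_project_root(Path.cwd())

# After this, functions kept in src/ can be imported, for example:
# from src.cleaning import clean_rent_data  # hypothetical module and function
```

Installing the project as a package would be a cleaner long-term option, but this keeps the change small.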
Acquisition and enrichment of database: Your decision to analyze wage data alongside the rent data gives very nice insight into the situation, covering not only the real estate market but also the actual economic impact of rising rents and stagnant wages.
But the source of the data you use as a data analyst is very important. In this case, you scraped the countryeconomy.com website, which doesn’t specify the source of its data. Is it official? Self-calculated? We don’t know. It is always a good idea to consult official sources; in this case, the INE’s Encuesta Anual de Estructura Salarial (https://www.ine.es/jaxiT3/Datos.htm?t=28187) could be a good first step.
And, since you are working with local data, it would be even better to look for wage data specific to Barcelona, because local wage trends could differ from national ones. Here is data from 2021 (https://ajuntament.barcelona.cat/barcelonaeconomia/sites/default/files/Salaris_2021.pdf) showing that the gross annual salary of Barcelona’s residents was 32,324 euros in 2021, above both the INE’s figure (25,896 euros) and countryeconomy.com’s figure (27,570 euros).
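To put those gaps in perspective, here is a quick sketch that turns the three 2021 figures quoted above into percentage differences. The variable and function names are only illustrative.

```python
# Gross annual salary figures for 2021, in euros, as quoted in this review.
barcelona = 32_324
ine_national = 25_896
countryeconomy = 27_570


def percent_gap(value: float, baseline: float) -> float:
    """Return how far `value` sits above `baseline`, as a percentage of baseline."""
    return (value - baseline) / baseline * 100


gap_vs_ine = percent_gap(barcelona, ine_national)
gap_vs_countryeconomy = percent_gap(barcelona, countryeconomy)

print(f"Barcelona vs INE: {gap_vs_ine:.1f}% higher")
print(f"Barcelona vs countryeconomy.com: {gap_vs_countryeconomy:.1f}% higher")
```

The Barcelona figure comes out roughly 25% above the INE number and roughly 17% above the countryeconomy.com number, which is large enough to change how affordable rents look.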
On your Canva slides, you use a line graph to show the correlation between average salary over the years and rent increase, but a scatterplot is a better choice for showing the relationship between two variables.
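As a rough sketch of what that could look like, assuming matplotlib is installed: the code below puts salary on one axis and rent on the other and also computes the Pearson correlation. The numbers are made-up placeholders, not your data, and the function names and output path are hypothetical.

```python
from math import sqrt

# Placeholder values for illustration only; replace with the real yearly series.
avg_salary_eur = [24000, 24500, 25000, 25200, 25900]
avg_rent_eur = [700, 760, 820, 900, 950]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def plot_scatter(xs, ys, path):
    """Save a salary-vs-rent scatterplot to `path` (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel("Average annual salary (EUR)")
    ax.set_ylabel("Average monthly rent (EUR)")
    ax.set_title(f"Salary vs. rent (r = {pearson_r(xs, ys):.2f})")
    fig.savefig(path)
    plt.close(fig)


r = pearson_r(avg_salary_eur, avg_rent_eur)
```

Calling `plot_scatter(avg_salary_eur, avg_rent_eur, "salary_vs_rent.png")` writes the image to disk. Each point is one year, so the reader sees the relationship directly instead of comparing two lines over time.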
One way to make your README more attractive is to include your visualizations right alongside your conclusions, so the reader can see both at the same time. Here is a tutorial on how to do that: https://www.digitalocean.com/community/tutorials/markdown-markdown-images
I really liked how you wrote your conclusions. The writing is clear and has a good mix of text and data to support what you are saying. I would only suggest splitting it into more than one paragraph. Big blocks of continuous text are hard to read and can discourage some people.
https://github.com/linharesjunior/Project-Data-Pipeline.git