Edu Project II - Githubissues

You did great on this project! Some comments:

When you relate Ronaldo’s performance to the sun’s position in the zodiac, you need to be very careful. Remember that correlation is not causation.

In your presentation, question 2 is “Did astrology have more or less of an effect on his most important matches for Real Madrid?”, and your data can’t determine that. You can only say whether his performance was better or worse when the sun was in or near his sign than when it was not. The same applies to question 3.
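To make the distinction concrete, here is a minimal sketch of the kind of comparison the data does support: a purely descriptive summary of goals per match, split by whether the sun was in his sign. The field names `goals` and `sun_in_sign`, the function name and the example rows are all assumptions for illustration, not taken from your dataset.

```python
from statistics import mean


def mean_goals_by_sun_position(matches):
    """Return the mean goals per match for each value of 'sun_in_sign'.

    This only describes the two groups. It says nothing about whether
    the sun's position caused any difference between them.
    """
    groups = {True: [], False: []}
    for match in matches:
        groups[bool(match["sun_in_sign"])].append(match["goals"])
    # Skip a group that has no matches, since mean() fails on empty input.
    return {flag: mean(goals) for flag, goals in groups.items() if goals}


# Made-up rows for illustration only, not real match data.
example_matches = [
    {"goals": 2, "sun_in_sign": True},
    {"goals": 0, "sun_in_sign": True},
    {"goals": 1, "sun_in_sign": False},
    {"goals": 3, "sun_in_sign": False},
]

summary = mean_goals_by_sun_position(example_matches)
print(summary)
```

A result like this lets you write "he scored more or fewer goals per match in these periods", which is a claim about association and not about the effect of astrology.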

I was also left wondering why, astrologically speaking, the sun being in the sign is important. It would be good to add an explanation of that to your project.

Repository

You have a good folder structure and repository.

In your data folder you have two files called CR7, one .zip and one .csv. It is best to keep only one version of a file (the .csv in this case) or, if the files contain different information, to make that clear with informative file names, so readers have an idea of what they’ll find inside each file.

The same goes for your image files: instead of figure1.png and figure2.png, it would be better to name the files after your graphs’ titles (“total_goals_ronaldo_real_madrid.png”, for example).
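One way to keep file names and graph titles in sync is to derive the name from the title. The sketch below is only a suggestion; `figure_filename` is a hypothetical helper, not something that exists in your repository.

```python
import re


def figure_filename(title: str, extension: str = "png") -> str:
    """Turn a graph title into a lowercase, underscore-separated file name."""
    # Replace every run of characters other than a-z and 0-9 with a single
    # underscore. Note that accented letters are replaced as well.
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return f"{slug}.{extension}"


print(figure_filename("Total goals: Ronaldo, Real Madrid"))
```

You could then pass the result to whatever call you use to save the figure, so the title and the file name never drift apart.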

Acquisition and enrichment of database

As you saw, sometimes the first plan doesn’t work, but you did great by finding a way to replace the API’s information by scraping Wikipedia to enrich your dataset.

I would suggest dividing your code into several notebooks: one for scraping, another for cleaning, another for visualizations and one for creating your report, instead of having everything in notebook_clean.ipynb.

Reporting

Your analysis and conclusions are mainly in your README, but it is important to include them in the notebook that serves as your final report too. In this case that could be notebook_clean.ipynb, or you could create a notebook just for the final report, where you can develop your storytelling. That gives you the opportunity to write more in-depth analysis and conclusions there and keep only the most attractive points in the README.

README

You have a very clear and informative README. I particularly like your Limitations section. It is always a great idea to recognize a project’s limitations and areas of opportunity.

Bonus: modularization/encapsulation

I see that you created all your .py files with your functions. The next step would be to import them into your notebook (or several notebooks, if you separate the different stages of your project: scraping, cleaning, etc.). Then you can modify your notebooks to call the functions from those modules.

A common best practice is that a function should complete only one task, not several. So, in your visualization.py, it would be better to create one function for each visualization, instead of using one function to make all of your graphs.

Maybe you just want to update one graph, but with all the code inside one function you can’t do that without regenerating every other graph as well.
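As a rough sketch of what that split could look like, each function below draws exactly one graph on an axes object that the caller passes in. The function names, titles and parameters are assumptions for illustration, not the contents of your visualization.py. The `ax` argument is expected to be a matplotlib Axes, for example one created with `plt.subplots()` in the notebook.

```python
from typing import Sequence


def plot_goals_per_season(ax, seasons: Sequence[str], goals: Sequence[int]):
    """Draw a single bar chart of goals per season on the given axes."""
    ax.bar(seasons, goals)
    ax.set_title("Total goals per season")
    ax.set_xlabel("Season")
    ax.set_ylabel("Goals")
    return ax


def plot_mean_goals_by_sun_position(ax, labels: Sequence[str],
                                    mean_goals: Sequence[float]):
    """Draw a single bar chart of mean goals per match for each group."""
    ax.bar(labels, mean_goals)
    ax.set_title("Mean goals per match by sun position")
    ax.set_xlabel("Sun position")
    ax.set_ylabel("Mean goals per match")
    return ax
```

With this layout, a notebook can import only the function it needs (for instance `from visualization import plot_goals_per_season`, assuming the functions live in visualization.py) and redraw that one graph without touching the others.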

Ironhack-data-bcn-oct-2023 / project-II-pipelines

Edu Project II #10