mid bootcamp project

Hello hello Charlotte 🙋🏻‍♀️ , here we go with the revision of your project

README

As we told you in the presentation, you have a perfect readme 🔥💪. I just have two little things to tell you about the libraries you have used:

You don't need to put the .py file where you have your functions.
If you put the links to the documentation it would be perfect!

How can we put those links in markdown?
```
[pandas](https://pandas.pydata.org/docs/)
```

Repo structure

Let's go with this part, I'll try to give you some tips to make the repo as clean as possible.

Your repo has many files, in this case the ideal would be to create different folders where we will save the different files.
- We can create a folder for the jupyters that we can call Notebooks. If you already number the files to know the working order it would be perfect!
- The .py file has to go in a folder called src (not that we are crazy maniacs, it is more of a convention 🤣).
- You have multiple text files where you make a detailed description of each of the phases of the job, put them all in a single folder.
Very good that you didn't leave any stray files out of the .gitignore.
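
If you want to do the reshuffle in one go, the folder layout above could be sketched like this. It is only a rough illustration: `Notebooks` and `src` come from the tips above, while the `docs` folder name, the `KEEP_IN_ROOT` list and the `organize_repo` function are assumptions of mine, so adapt them to your repo.

```python
from pathlib import Path
import shutil

# Hypothetical mapping from file extension to target folder.
# "Notebooks" and "src" follow the tips above; "docs" is an assumed name.
TARGET_FOLDERS = {".ipynb": "Notebooks", ".py": "src", ".txt": "docs"}

# Files that usually stay in the repo root (assumed list).
KEEP_IN_ROOT = {"requirements.txt"}


def organize_repo(root):
    """Move top-level files into folders by extension; return the moves made."""
    root = Path(root)
    moves = []
    # sorted() materializes the listing before anything is moved.
    for path in sorted(root.iterdir()):
        if not path.is_file() or path.name in KEEP_IN_ROOT:
            continue
        folder = TARGET_FOLDERS.get(path.suffix)
        if folder is None:
            continue  # README.md, .gitignore, etc. stay where they are
        destination = root / folder / path.name
        destination.parent.mkdir(exist_ok=True)
        shutil.move(str(path), str(destination))
        moves.append((path.name, f"{folder}/{path.name}"))
    return moves
```

Calling `organize_repo(".")` from the repo root returns the list of moves, so you can check what changed before committing.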

Good job in this part Charlotte!!

Code Syntax

Let's go with the code!

Solutions SQL- Classification
- I see that at the beginning of this file you use a function that you have created and that you have in the helper_classification.py file but I don't see that you have imported those functions into jupyter.
  
  How can we "bring" the functions from the .py to the jupyter?
  
  We have to put this in the jupyter:
```
import src.helper_classification as hc # (the alias that you put can be the one that we want) 
```
  Once we have this, we can access each of the functions that we have in that file. How do we call the functions now?
  
  We use the alias that we gave it and the name of the function that we want to use:
```
hc.get_started()  # pass whatever arguments the function expects
```
- Try to put all the imports at the beginning of the jupyter.
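
One thing to watch out for: if the notebooks live in their own folder, Python may not find `src` when you run the import from there. A minimal sketch of one way around it, assuming the notebook folder sits directly inside the repo root next to `src` (the `add_repo_root_to_path` helper is a name I made up for illustration):

```python
import sys
from pathlib import Path


def add_repo_root_to_path(notebook_dir=None):
    """Put the parent of the notebook folder on sys.path and return it."""
    notebook_dir = Path(notebook_dir) if notebook_dir else Path.cwd()
    repo_root = notebook_dir.resolve().parent
    # Only insert once, so re-running the cell does not pile up duplicates.
    if str(repo_root) not in sys.path:
        sys.path.insert(0, str(repo_root))
    return repo_root


# In a notebook, the working directory is normally the notebook's own folder.
repo_root = add_repo_root_to_path()

# With the repo root importable, the import shown above should work as written:
# import src.helper_classification as hc
```

If the jupyter stays in the repo root, none of this is needed and the plain import is enough.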

Nothing more to say about this part Charlotte, the truth is that you have it perfect, you have even interpreted the results, super 👏🏽!

Solutions_Python - Classification

First of all, you have made it very difficult for me to look for areas for improvement in your project because it is practically perfect. Let's go into some details, but as I say, I'm just being picky.

Regarding the KNN model, you chose the best value of k perfectly. As a detail, in Python we have the KElbowVisualizer class (from the Yellowbrick library), which lets us select the optimal number of clusters in a simple way by fitting the model with a range of values of k.

Here is some documentation.

```
# here is an example of what the code would look like
from sklearn.cluster import KMeans
from yellowbrick.cluster import KElbowVisualizer

# set the model
model = KMeans()

# initialize the visualizer; k is the range of k values we want to test
visualizer = KElbowVisualizer(model, k=(2,15), metric='silhouette')

# fit the model for every k in that range (X is your feature matrix)
visualizer.fit(X)

# show a plot highlighting the optimal k
visualizer.show()
```
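
To make the idea of "the elbow" a bit more concrete, here is a small sketch of one common heuristic: pick the k whose point lies farthest from the straight line joining the first and last points of the curve. This is only an illustration, not necessarily what KElbowVisualizer does internally; `find_elbow` is a hypothetical helper, and real implementations usually normalize both axes first.

```python
def find_elbow(ks, scores):
    """Return the k whose point is farthest from the line joining the endpoints."""
    if len(ks) != len(scores) or len(ks) < 3:
        raise ValueError("need at least three (k, score) pairs of equal length")
    x0, y0 = ks[0], scores[0]
    x1, y1 = ks[-1], scores[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx ** 2 + dy ** 2) ** 0.5
    best_k, best_distance = ks[0], 0.0
    for k, score in zip(ks, scores):
        # Perpendicular distance from (k, score) to the endpoint line.
        distance = abs(dy * (k - x0) - dx * (score - y0)) / norm
        if distance > best_distance:
            best_k, best_distance = k, distance
    return best_k
```

You would pass it the list of k values you tried and the score you got for each one (for example the inertia of each fit).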

TODOs

Here is a little recap of the tips:

- Libraries documentation links
- Elbow functions
- Using .py file in the jupyter

Well Charlotte, you have done an impeccable job: very well documented, with each step explained and the results interpreted, plus a great exploration of the data.

I can only congratulate you 👏🏽
