Ironhack-data-bcn-january-2024 / project-I-pandas


Project 1 #2

Closed Nikolas121205 closed 7 months ago

Nikolas121205 commented 7 months ago

https://github.com/Nikolas121205/Project

breogann commented 7 months ago

Hi Niko! 🙋🏻‍♂️

(if you read this from github, you'll see the code formatted)

Everything I comment on below is something I would change to polish the project a bit more.

In this function:

def clean_colnames(df):
    '''Clean columns names when passing a pandas dataframe: params (df - dataframe)'''
    col_clean = []
    for col in df.columns:
        col = col.strip().lower()
        col = col.replace('.',' ')
        col_clean.append(col)

    df.columns = col_clean
    return df.columns

Great that you put it into a function. You can refactor it so that it returns the cleaned names:

def clean_colnames(df):
    '''Clean columns names when passing a pandas dataframe: params (df - dataframe)'''
    col_clean = []
    for col in df.columns:
        col = col.strip().lower()
        col = col.replace('.',' ')
        col_clean.append(col)

    return col_clean

df.columns = clean_colnames(df)

instead of re-assigning inside of the function.
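If you want to go one step further, the loop can be collapsed into a list comprehension. This is only a sketch of one possible variation: unlike the version above, it takes the column names themselves (for example `df.columns`) rather than the dataframe, and the sample names are made up for illustration.

```python
def clean_colnames(columns):
    """Return cleaned copies of the given column names; the input is not modified."""
    # Same steps as the loop: trim whitespace, lowercase, replace dots with spaces.
    return [col.strip().lower().replace(".", " ") for col in columns]


# Made-up column names, only to show the behaviour.
raw = [" Case.Number ", "Fatal.(Y/N)"]
cleaned = clean_colnames(raw)
print(cleaned)  # ['case number', 'fatal (y/n)']

# With a dataframe, the assignment stays outside the function:
# df.columns = clean_colnames(df.columns)
```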

For this function:

def country_hem(col):
    hem_lst = []
    for i in col:
        try:
            country = CountryInfo(row)
        except AttributeError:
            row = 'x'

        try:
            pos = country.latlng()
        except KeyError:
            pos = (0,0)

        if pos[0] > 0:
            temp_row = 1
        elif pos[0] < 0:
            temp_row = 0
        else:
            temp_row = np.nan

        hem_lst.append(temp_row)

    return hem_lst   

Did you add the except AttributeError because you had missing values? I do find it to be a nice workaround, honestly. Another thing you could do is transform the nulls into something else before the loop, just in case. Or maybe drop them, if you think that makes sense for the rest of the processing.
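As a rough sketch of the "handle the nulls explicitly" idea: check for missing values before the lookup instead of relying on the exception. To keep it runnable without extra packages, the hypothetical `LATITUDES` dictionary stands in for the `CountryInfo(...).latlng()` lookup, and the latitudes in it are only illustrative. `hemisphere_flag` is also a made-up name.

```python
import math

# Hypothetical lookup standing in for CountryInfo(name).latlng()[0];
# the values are illustrative, not taken from the project.
LATITUDES = {"spain": 40.0, "australia": -27.0}


def hemisphere_flag(name):
    """Return 1 for northern, 0 for southern, NaN for missing or unknown."""
    # Missing values in a pandas column usually arrive as None or float NaN,
    # so anything that is not a string is treated as missing.
    if not isinstance(name, str):
        return math.nan
    lat = LATITUDES.get(name.strip().lower())
    if lat is None or lat == 0:
        return math.nan
    return 1 if lat > 0 else 0


values = ["Spain", None, float("nan"), "Australia", "Atlantis"]
flags = [hemisphere_flag(v) for v in values]
```

Here the missing entries and the unknown country all end up as NaN, so they can be dropped or filled later in one place.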

In terms of plots: sort the bars either by count or alphabetically, as we mentioned in the presentation, and rotate the axis labels to improve readability. Also, try to avoid this by ending the plotting line with a semicolon:

Screenshot 2024-02-01 at 20 27 26
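A minimal sketch of those three plotting points together, assuming pandas and matplotlib are installed. The dataframe and its "country" column are made up for illustration, and the Agg backend is set only so the snippet also runs outside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data; "country" is a hypothetical column name.
df = pd.DataFrame({"country": ["Spain", "Brazil", "Spain", "Japan", "Brazil", "Spain"]})

# value_counts() sorts by count (descending) by default;
# use .sort_index() afterwards for alphabetical order instead.
counts = df["country"].value_counts()

fig, ax = plt.subplots()
# rot=45 rotates the x tick labels; in a notebook, the trailing semicolon on the
# last line of the cell keeps the returned Axes object from being echoed.
counts.plot(kind="bar", rot=45, ax=ax);
```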

Anyway, all of these are simple, cosmetic changes; functionally the code is okay. Meaning: the project is good, and a few small adjustments would make it a bit better.

Do include a readme.md though!!! That is not just a detail.

Good job though!

https://user-images.githubusercontent.com/54676992/106113904-4271d880-614f-11eb-9326-6a06e40aaf3e.png