UBC-MDS / noaastn

Python package that downloads, processes and visualizes weather data from the NOAA website.
https://noaastn.readthedocs.io/en/latest/
MIT License
3 stars 3 forks

Package Review Suggestion For plot_weather_data #46

Open ChadNeald opened 3 years ago

ChadNeald commented 3 years ago
import altair as alt
import pandas as pd


def plot_weather_data(obs_df, col_name, time_basis):
    """
    Visualizes weather station observations (air temperature, atmospheric
    pressure, wind speed, or wind direction) as they change over time.

    Parameters
    ----------
    obs_df : pandas.DataFrame
        A dataframe that contains a time series of weather station
        observations.
    col_name : str
        The variable to plot over time. One of 'air_temp', 'atm_press',
        'wind_spd', or 'wind_dir'.
    time_basis : str
        The time basis to plot the observations on, either 'monthly' or
        'daily'.

    Returns
    -------
    altair.vegalite.v4.api.Chart
        A line chart of the chosen variable on the chosen time basis.

    Examples
    --------
    >>> plot_weather_data(obs_df, col_name="air_temp", time_basis="monthly")
    """

    # Test input types
    assert isinstance(
        obs_df, pd.DataFrame
    ), "Weather data should be a Pandas DataFrame."
    assert isinstance(col_name, str), "Variable name must be entered as a string"
    assert isinstance(time_basis, str), "Time basis must be entered as a string"
    # Test edge cases
    assert col_name in [
        "air_temp",
        "atm_press",
        "wind_spd",
        "wind_dir",
    ], "Variable can only be one of air_temp, atm_press, wind_spd or wind_dir"
    assert time_basis in [
        "monthly",
        "daily",
    ], "Time basis can only be monthly or daily"

    df = obs_df.dropna()
    assert (
        len(df.index) > 2
    ), "Dataset is not sufficient to visualize"  # Test edge cases
    # Use positional access: after dropna() the row labelled 0 may be gone.
    year = df.datetime.dt.year.iloc[0]

    title_dic = {"air_temp": "Air Temperature",
                 "atm_press": "Atmospheric Pressure",
                 "wind_spd": "Wind Speed",
                 "wind_dir": "Wind Direction"}

    if time_basis == "monthly":
        df = df.set_index("datetime").resample("M").mean().reset_index()
        assert (
            len(df.index) > 2
        ), "Dataset is not sufficient to visualize"  # Test edge cases

        line = (
            alt.Chart(df, title=title_dic[col_name] + " for " + str(year))
            .mark_line(color="orange")
            .encode(
                alt.X(
                    "month(datetime)",
                    title="Month",
                    axis=alt.Axis(labelAngle=-30),
                ),
                alt.Y(
                    col_name,
                    title=title_dic[col_name],
                    scale=alt.Scale(zero=False),
                ),
                alt.Tooltip(col_name),
            )
        )

    else:
        df = df.set_index("datetime").resample("D").mean().reset_index()
        assert (
            len(df.index) > 2
        ), "Dataset is not sufficient to visualize"  # Test edge cases

        line = (
            alt.Chart(df, title=title_dic[col_name] + " for " + str(year))
            .mark_line(color="orange")
            .encode(
                alt.X(
                    "datetime", title="Date", axis=alt.Axis(labelAngle=-30)
                ),
                alt.Y(
                    col_name,
                    title=title_dic[col_name],
                    scale=alt.Scale(zero=False),
                ),
                alt.Tooltip(col_name),
            )
        )

    chart = (
        line.properties(width=500, height=350)
        .configure_axis(labelFontSize=15, titleFontSize=20, grid=False)
        .configure_title(fontSize=25)
    )

    return chart
ChadNeald commented 3 years ago

This is a package review suggestion showing how the plot_weather_data() function could be written in a more DRY style. The code above looks up the plot title and the y-axis label from col_name, so they no longer have to be hardcoded for each plot.
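The same lookup idea could probably be pushed one step further, so that the monthly and daily branches also share a single chart definition. The following is only a rough sketch under stated assumptions: `TITLES`, `TIME_SETTINGS` and `chart_settings` are hypothetical names made up for illustration and are not part of noaastn. It collects everything that differs between the plots (resample rule, x field, axis titles, chart title) in one place, using plain Python so it can be tried without pandas or Altair installed.

```python
# Hypothetical lookup tables; these names are illustrative, not noaastn's API.
TITLES = {
    "air_temp": "Air Temperature",
    "atm_press": "Atmospheric Pressure",
    "wind_spd": "Wind Speed",
    "wind_dir": "Wind Direction",
}

# Per-time-basis settings: pandas resample rule, Altair x field, x-axis title.
# The values mirror the ones used in the function above.
TIME_SETTINGS = {
    "monthly": {"rule": "M", "x_field": "month(datetime)", "x_title": "Month"},
    "daily": {"rule": "D", "x_field": "datetime", "x_title": "Date"},
}


def chart_settings(col_name, time_basis, year):
    """Collect everything that differs between the plots in one dict."""
    if col_name not in TITLES:
        raise ValueError(f"col_name must be one of {sorted(TITLES)}")
    if time_basis not in TIME_SETTINGS:
        raise ValueError(f"time_basis must be one of {sorted(TIME_SETTINGS)}")
    settings = dict(TIME_SETTINGS[time_basis])  # copy, so the table is not mutated
    settings["y_title"] = TITLES[col_name]
    settings["chart_title"] = f"{TITLES[col_name]} for {year}"
    return settings


print(chart_settings("air_temp", "monthly", 2020))
```

With a helper like this, the function body would need only one `resample(settings["rule"])` call and one `alt.Chart(...)` expression that reads its titles and x field from `settings`, instead of an if/else with two near-identical charts.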

chenzhao2020 commented 3 years ago

Hello @ChadNeald, thank you so much for your suggestions. I'll modify the plot_weather_data() function based on your help above and will keep you updated on the progress. Thanks a lot!

chenzhao2020 commented 3 years ago

Thank you so much for the suggestion, @ChadNeald! I have removed the repeated code in the plot_weather_data() function based on your instructions. Much appreciated! I will close this issue when the next PR is merged.