point8 / data-science-learning-paths

Practical data science courses - from basic to intermediate

🤖Review and expand DataFrame operations section in "Data Handling with Pandas" notebook #16

Open clstaudt opened 4 months ago

Issue Description

The notebook introduces basic DataFrame operations but can be expanded to showcase a wider range of common manipulations, including handling missing data and more complex filtering.

Examples

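The notebook could include examples of:

- Handling missing data with .dropna() and forward-filling
- Complex boolean indexing that combines several conditions
- Filtering with .query()
- Applying a function to a column with .apply()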

Proposed Change

Example Implementation

# Handling missing data
df_cleaned = df.dropna()  # Drops rows that contain any missing value
df_filled = df.ffill()  # Forward-fills missing values; fillna(method='ffill') is deprecated since pandas 2.1

# Complex boolean indexing
high_quality_red = df[(df['quality'] > 7) & (df['color'] == 'red')]

# Using .query() for filtering
high_quality_red_query = df.query("quality > 7 and color == 'red'")

# Applying a function with .apply()
df['quality_label'] = df['quality'].apply(lambda x: 'high' if x > 7 else 'low')
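
The snippet above assumes a `df` that already exists in the notebook. As a rough, self-contained sketch, the same operations could be tried on a small made-up DataFrame. The sample values below are hypothetical and only borrow the `quality` and `color` column names from the snippet; the notebook's actual dataset may look different.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the issue.
df = pd.DataFrame(
    {
        "quality": [8, 5, np.nan, 9, 6],
        "color": ["red", "white", "red", "red", "white"],
    }
)

# Count missing values per column before deciding how to handle them.
missing_counts = df.isna().sum()

# Drop only the rows where 'quality' is missing, instead of any column.
df_quality_known = df.dropna(subset=["quality"])

# Forward-fill: each gap takes the last valid value above it.
df_filled = df.ffill()

# Boolean indexing and .query() express the same filter.
mask_result = df_filled[(df_filled["quality"] > 7) & (df_filled["color"] == "red")]
query_result = df_filled.query("quality > 7 and color == 'red'")

# Derive a label column from 'quality' with .apply().
df_filled["quality_label"] = df_filled["quality"].apply(
    lambda x: "high" if x > 7 else "low"
)

print(missing_counts)
print(mask_result)
print(df_filled)
```

Showing the boolean-indexing and `.query()` results side by side may help learners see that both select the same rows, and `dropna(subset=...)` illustrates a less aggressive alternative to dropping every row with any missing value.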