udacity / AIPND

Code and associated files for the AI Programming with Python Nanodegree Program
MIT License

Missing Matplotlib Dataset #8

Closed ScriptAutomate closed 3 years ago

ScriptAutomate commented 5 years ago

A file is missing from the Matplotlib / data directory. The lessons consistently use a dataset that includes values such as "Alpha", "Beta", "Gamma", etc.

To work through those exercises, I had to randomly generate grade data as a substitute:

# Back to auto-generated data for the following steps
import random
import pandas as pd
# Draw 30 random letter grades as stand-in categorical data
student_grades = []
for student in range(30):
    student_grades.append(random.choice(['A', 'B', 'C', 'D', 'F']))
grades = pd.DataFrame({'Grades': pd.Series(student_grades)})
grades.head()

I had to replace every instance of df in the lesson's example code with grades, and every instance of cat_var with Grades. The bar chart manipulations shown in the lesson should apply to grades in the same way they do to Alpha, Beta, etc.
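An alternative that avoids the renaming is to generate the stand-in data under the lesson's own names. The following is only a sketch: the names df and cat_var come from the lesson code mentioned above, but the list of levels is an assumption, since only "Alpha", "Beta", and "Gamma" are named here and the real dataset presumably has more.

```python
import random

import pandas as pd

random.seed(42)  # fixed seed so reruns produce the same stand-in data

# Assumption: only these three level names are known from the lessons;
# the real dataset likely contains additional levels.
levels = ['Alpha', 'Beta', 'Gamma']

# Same names as the lesson code (df, cat_var), so the examples can run unchanged
df = pd.DataFrame({'cat_var': [random.choice(levels) for _ in range(30)]})

print(df['cat_var'].value_counts())
```

With this frame in place, lesson snippets that reference df and cat_var should run as written, although the counts will not match the lesson's output.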

This worked for the lesson's examples, such as categorical ordering:

# With grades, or categorical / ordinal data that is ranked by type,
# we may care about an explicit order. Ordering it up!
import pandas as pd
import seaborn as sb
level_order = ['A', 'B', 'C', 'D', 'F']
ordered_cat = pd.api.types.CategoricalDtype(ordered = True, categories = level_order)
grades['Grades'] = grades['Grades'].astype(ordered_cat)
# Display graph in a single color, with bars in the order given by level_order
base_color = sb.color_palette()[0]
sb.countplot(data = grades, x = 'Grades', color = base_color);
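To confirm the ordering took effect without looking at the plot, the counts can be inspected directly. This is a minimal sketch that rebuilds the random grades so it runs on its own; the seed value is arbitrary and not part of the original exercise.

```python
import random

import pandas as pd

random.seed(0)  # arbitrary seed, only for reproducibility

level_order = ['A', 'B', 'C', 'D', 'F']
grades = pd.DataFrame({'Grades': [random.choice(level_order) for _ in range(30)]})

# Same conversion as above: an ordered categorical with explicit levels
ordered_cat = pd.api.types.CategoricalDtype(categories=level_order, ordered=True)
grades['Grades'] = grades['Grades'].astype(ordered_cat)

# sort_index() arranges the counts by category order (A..F) rather than by frequency
counts = grades['Grades'].value_counts().sort_index()
print(counts)
print(grades['Grades'].cat.ordered)
```

The index of counts should follow level_order, which is the same order the bars take along the x-axis.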

I have added more detail in an answer to a question in the Udacity Knowledge Base, where someone ran into the same problem. The answer sits behind an authenticated page: How to practice examples (matplotlib/seaborn) with no .csv file available for download??

abhiojha8 commented 3 years ago

Though we are late to respond, can you share which lesson or exercise used that dataset (with alpha, beta, gamma, etc. values)?

Since that dataset is not used in the examples shared in this GitHub repo, it's not included here.

ronny-udacity commented 3 years ago

Confirming @abhiojha8's comment about the datasets used in the matplotlib exercises.