UBC-MDS / programming-in-python-for-data-science

https://prog-learn.mds.ubc.ca/

RoadMap Module 4 #31

Closed hfboyce closed 4 years ago

hfboyce commented 4 years ago

Organized by exercise (with practice problems in between the numbered sections)

  1. Python data types (float/int/bool/string/NoneType)

    • Talk about how we've seen numbers (ints and floats) and how words and characters in quotation marks (single, double, or three single quotes) are strings. Also cover True/False as booleans
    • NaN in a dataframe: what are these?
    • Introduce type() function to identify them
    • Introduce .lower() and len()
    • casting
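
A minimal sketch of the ideas in section 1, assuming made-up example values (the variable names and data below are hypothetical, not taken from the course material):

```python
import math

# Hypothetical example values, one per basic type.
age = 32                 # int
height = 1.75            # float
name = "Ada Lovelace"    # str (single, double, or triple quotes all work)
is_student = True        # bool
nothing = None           # NoneType

# type() identifies the type of each value.
type_names = [type(v).__name__ for v in (age, height, name, is_student, nothing)]
print(type_names)

# NaN ("not a number") is a special float that marks a missing value.
missing = float("nan")
print(type(missing).__name__, math.isnan(missing))

# .lower() is a string method; len() is a built-in function.
lowered = name.lower()
length = len(name)
print(lowered, length)

# Casting converts a value from one type to another.
as_int = int("42")       # str -> int
as_float = float(age)    # int -> float
as_str = str(height)     # float -> str
print(as_int, as_float, as_str)
```
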
  2. Python Data Structures: Lists, Tuples, and Sets

    • Explain str.split() segue into lists (from output)
    • append() and the values in a list (lists can have lists as entries, etc.)
    • Explain what immutable means
    • Compare and contrast lists, tuples, and sets
    • Create a dataframe from list
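
Section 2 could be illustrated roughly as follows. This is only a sketch; the sentence, the column names, and the row values are hypothetical:

```python
import pandas as pd

# str.split() returns a list, which is a natural segue into lists.
sentence = "apple banana cherry"
fruits = sentence.split()
fruits.append("date")            # lists are mutable: append() changes them in place
nested = [1, "two", [3, 4]]      # a list can hold mixed types, including other lists

# Tuples are immutable: item assignment raises a TypeError.
point = (1, 2)
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Sets keep only unique values and have no defined order.
unique = set([1, 2, 2, 3])

# A dataframe can be built from a list of rows (hypothetical column names).
rows = [["apple", 3], ["banana", 5]]
df = pd.DataFrame(rows, columns=["fruit", "count"])
print(df)
```
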
  3. Python Data Structures: Dictionaries

    • Define dictionary
    • Use an actual dictionary as an analogy (word : definition == key : value)
    • Features, how to obtain key and value
    • Create a dataframe from dictionary (the json idea is in the assignment here)
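
One possible illustration of section 3, using a hypothetical glossary and hypothetical column data:

```python
import pandas as pd

# Like a real dictionary: each word (key) maps to a definition (value).
glossary = {
    "list": "an ordered, mutable collection",
    "tuple": "an ordered, immutable collection",
}

definition = glossary["tuple"]       # look up a value by its key
keys = list(glossary.keys())         # all keys
values = list(glossary.values())     # all values
glossary["set"] = "an unordered collection of unique values"   # add a new entry

# Each key becomes a column name and each list becomes that column's values.
data = {"fruit": ["apple", "banana"], "count": [3, 5]}
df = pd.DataFrame(data)
print(df)
```
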
  4. Dataframe/column type and dtypes

    • Start by taking the type of a dataframe
    • Take the type of a column = pandas Series
    • Double square brackets vs single square brackets
    • Dataframe = made up of pandas Series
    • Introduce column dtypes
    • .dtypes attribute
    • What's in each cell for column dtypes.
    • Diagram
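
A short sketch of section 4, assuming a small hypothetical dataframe. The exact dtype reported for text columns can differ between pandas versions, so it is only printed here:

```python
import pandas as pd

# Hypothetical data with one text, one integer, and one float column.
df = pd.DataFrame({
    "name": ["apple", "banana"],
    "count": [3, 5],
    "price": [1.5, 2.0],
})

frame_type = type(df)               # pandas DataFrame
single = df["count"]                # single brackets select one column as a Series
double = df[["count"]]              # double brackets return a one-column DataFrame

print(frame_type.__name__, type(single).__name__, type(double).__name__)

# .dtypes is an attribute (no parentheses) listing the dtype of every column.
print(df.dtypes)

# Every value in a column shares that column's dtype.
print(single.dtype)
```
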
  5. Operations with different data types

    • Different operators
    • Comparison Operators with booleans
    • What happens when we add strings?
    • What happens when we divide 2 ints?
    • Can we add different types together?
    • What operations work on each?
    • Can this logic be carried over to operations on column dtypes? (Segue to the next section)
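
The questions in section 5 can be answered with a few one-liners. The values below are hypothetical examples:

```python
# Adding an int and a float gives a float.
total = 3 + 4.5

# Adding strings concatenates them; multiplying a string by an int repeats it.
joined = "data" + "science"
repeated = "ab" * 3

# Dividing two ints with / always gives a float; // gives floor division.
quotient = 7 / 2
floor_quotient = 7 // 2

# Comparison operators return booleans.
comparison = 3 < 4.5

# Booleans behave like 1 and 0 in arithmetic.
bool_sum = True + True

# Not every pair of types can be combined: str + int raises a TypeError.
try:
    "3" + 4
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(total, joined, repeated, quotient, floor_quotient, comparison, bool_sum, mixed_ok)
```
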
  6. Operations with Columns

    • Splitting up a column of dtype object
    • default column dtype
    • .astype()
    • Taking the mean, sum, max, and min of columns with different dtypes (what happens?) (.mean(axis=1) vs .mean(axis=0))
    • changing column dtypes
    • dtype argument in pd.read_csv()
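
A rough sketch of section 6, assuming hypothetical column names and an in-memory CSV string in place of a real file:

```python
import io

import pandas as pd

# Hypothetical data: score_a was read in as text, score_b as floats.
df = pd.DataFrame({
    "name": ["Ada Lovelace", "Alan Turing"],
    "score_a": ["90", "85"],
    "score_b": [70.0, 95.0],
})

# Split a text column into two new columns.
parts = df["name"].str.split(" ", expand=True)
df["first"] = parts[0]
df["last"] = parts[1]

# .astype() changes a column's dtype (here text -> integer).
df["score_a"] = df["score_a"].astype(int)

scores = df[["score_a", "score_b"]]
col_means = scores.mean(axis=0)   # one mean per column (down the rows)
row_means = scores.mean(axis=1)   # one mean per row (across the columns)
print(col_means)
print(row_means)

# The dtype argument of pd.read_csv() sets column dtypes at load time,
# for example to keep leading zeros in an ID column.
csv_text = "id,score\n001,90\n002,85\n"
from_csv = pd.read_csv(io.StringIO(csv_text), dtype={"id": str})
print(from_csv)
```
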