My setup uses Pandas 0.24.2, so this might not be a problem in other environments.
I ran into the following traceback when executing my rewritten fetch_start_end_date function (this should probably be a separate issue, but the original function raises a file-not-found error because it isn't looking in the /input/course/session directory):
    import pandas as pd
    from datetime import datetime

    def fetch_start_end_date(course, session_dir, date_csv="coursera_course_dates.csv"):
        """
        Fetch course start and end date (so user does not have to specify them directly).
        :param course: course name.
        :param session_dir: input directory.
        :param date_csv: Path to csv of course start/end dates.
        :return: tuple of datetime objects (course_start, course_end)
        """
        # session_dir is concatenated as-is, so it must end with a path separator
        date_df = pd.read_csv(
            "{}{}".format(session_dir, date_csv),
            error_bad_lines=False
        ).set_index("course")
        course_start = datetime.strptime(date_df.loc[course].start_date, "%m/%d/%y")
        course_end = datetime.strptime(date_df.loc[course].end_date, "%m/%d/%y")
        return course_start, course_end
Due to some encoding problems, this raises the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x97 in position 38: invalid start byte
I fixed this locally by changing the pd.read_csv() call to include:
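A minimal sketch of one such change, assuming the CSV was saved as Windows-1252: in that encoding byte 0x97 is an em dash, while in UTF-8 it is only valid as a continuation byte, which would explain the "invalid start byte" message. The `encoding="cp1252"` value and the sample row below are assumptions for illustration, not confirmed details of the actual file.

```python
import io

import pandas as pd

# Hypothetical sample data: a course name containing an em dash (U+2014),
# encoded as Windows-1252 so that the dash becomes the single byte 0x97.
raw = (
    "course,start_date,end_date\n"
    "intro\u2014stats,01/05/15,03/01/15\n"
).encode("cp1252")

# Reading with the default UTF-8 codec fails on byte 0x97.
try:
    pd.read_csv(io.BytesIO(raw))
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Passing the encoding explicitly (an assumption about the file) lets the read succeed.
date_df = pd.read_csv(io.BytesIO(raw), encoding="cp1252").set_index("course")
start_date = date_df.loc["intro\u2014stats", "start_date"]
```

If the file turns out to use a different legacy encoding, the same `encoding` argument applies with that codec's name instead.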