Extract or Identify the Department of Course Syllabus

Note: This is one of several issues related to basic information retrieval from the syllabi. We are assuming in all cases that the extraction is from a .txt document.

Task: Given a syllabus in .txt format, identify and extract the department of the course. This involves two possible approaches:

1) If the department is stated in the syllabus, extract it.
2) If the department is not stated in the syllabus, use machine learning approaches (such as Naive Bayes, TF-IDF, or topic modeling) to infer the likely department from the text of the syllabus (e.g., the course description or the books mentioned).
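
The two cases could be sketched roughly as follows. This is only an illustration under stated assumptions, not an agreed design: the regular expression, the function names, and the `NaiveBayesDepartmentClassifier` class are all hypothetical, and the classifier is a small hand-rolled multinomial Naive Bayes with add-one smoothing that would have to be trained on syllabi already labeled with a department.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical pattern: matches lines such as "Department of History" or
# "Dept: Computer Science". Real syllabi will need more patterns than this.
DEPT_PATTERN = re.compile(
    r"^\s*(?:Department|Dept\.?)(?:\s+of\s+|\s*:\s*)(?P<name>[A-Za-z&,\- ]+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_stated_department(text):
    """Case 1: return the department named in the text, or None if absent."""
    match = DEPT_PATTERN.search(text)
    return match.group("name").strip() if match else None


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesDepartmentClassifier:
    """Case 2: minimal multinomial Naive Bayes (assumed approach, toy scale)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # department -> token counts
        self.doc_counts = Counter()              # department -> number of syllabi
        self.vocabulary = set()

    def fit(self, texts, departments):
        for text, dept in zip(texts, departments):
            tokens = tokenize(text)
            self.word_counts[dept].update(tokens)
            self.doc_counts[dept] += 1
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_dept, best_score = None, -math.inf
        for dept, counts in self.word_counts.items():
            # Log prior plus add-one smoothed log likelihood of each token.
            score = math.log(self.doc_counts[dept] / total_docs)
            total_words = sum(counts.values())
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_dept, best_score = dept, score
        return best_dept


def identify_department(text, classifier):
    """Prefer the stated department; otherwise fall back to the classifier."""
    stated = extract_stated_department(text)
    return stated if stated is not None else classifier.predict(text)


# Made-up training examples, for illustration only.
toy_classifier = NaiveBayesDepartmentClassifier().fit(
    ["algorithms data structures programming python",
     "revolution empire war archives primary sources"],
    ["Computer Science", "History"],
)
```

With the toy classifier above, `identify_department(open(path).read(), toy_classifier)` would return the stated department when a matching line exists and the classifier's guess otherwise (`path` is a placeholder for a syllabus file). A production version would likely swap in TF-IDF features or a library classifier and a much larger labeled set.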

