CredentialEngine / ai-course-crawler

Apache License 2.0

Credit Unit Type assumed to be Semester Credit when course URLs do not indicate the credit type #31

Closed rvilsack closed 2 months ago

rvilsack commented 3 months ago

Instead of documenting this issue with each course extract/test, we wanted to flag it as a separate issue.

We're noticing that the Credit Unit Type assigned to courses with a credit value is overwhelmingly Semester Credit, even when nothing on the course URLs indicates that the credits are on a semester basis.

Here is an example:

Thomas Edison Community College
URL tested: https://www2.tesu.edu/listall.php
Crawler extraction link: https://github.com/CredentialEngine/ai-course-crawler/issues/27
BU spreadsheet: https://docs.google.com/spreadsheets/d/1TEwPDD3xVGZrNi1G-9-KqS9muhox9gkCEMg9bjWvIy4/edit?usp=drive_link

See row 2 here. The credit value (3) is correctly captured, but nothing on this page suggests a semester credit value:

Screenshot 2024-08-20 102801

While it may be assumed that a college runs on a semester schedule, the model could be trained to not make this assumption and instead use the Credit Unit Type Description term. So the header rows (values) would be:

Credit Unit Value: 3
Credit Unit Max Value: (blank)
Credit Unit Type: (blank)
Credit Unit Type Description: This has credit value, but the type cannot be determined.
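
To make the suggestion concrete, here is a rough sketch of the fallback we have in mind. It is not the crawler's actual code: the function name, the field names, the keyword map, and the "Quarter Credit" label are all assumptions for illustration, and the keyword check is a naive substring match. The idea is only that Credit Unit Type stays blank unless the page text explicitly names exactly one credit type.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword map; the real controlled values may differ.
CREDIT_TYPE_KEYWORDS = {
    "semester": "Semester Credit",
    "quarter": "Quarter Credit",
}

UNKNOWN_TYPE_DESCRIPTION = (
    "This has credit value, but the type cannot be determined."
)


@dataclass
class CreditFields:
    credit_unit_value: Optional[float]
    credit_unit_max_value: Optional[float] = None
    credit_unit_type: Optional[str] = None
    credit_unit_type_description: Optional[str] = None


def assign_credit_type(value: Optional[float], page_text: str) -> CreditFields:
    """Set Credit Unit Type only when the page text names one type."""
    fields = CreditFields(credit_unit_value=value)
    if value is None:
        # No credit value was extracted, so there is nothing to type.
        return fields

    text = page_text.lower()
    # Naive substring match against the hypothetical keyword map.
    matches = {
        label for keyword, label in CREDIT_TYPE_KEYWORDS.items()
        if keyword in text
    }

    if len(matches) == 1:
        fields.credit_unit_type = matches.pop()
    else:
        # Zero or conflicting signals: leave the type blank and explain why.
        fields.credit_unit_type_description = UNKNOWN_TYPE_DESCRIPTION
    return fields
```

With something like this, a page that only says "3 credits" would get the description text and a blank type, while a page that says "3 semester hours" would still get Semester Credit.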

Credit Unit Type is a controlled variable, so is the model defaulting to Semester Credit for any credit on a college course URL?

rsaksida commented 2 months ago

Thank you, Rachel. I'm closing this to consolidate the issues into #39.