EZCampusDevs / searchIndex

Database Indexing Search API written with Jakarta, Hibernate ORM connecting to MySQL and runs on GlassFish7
GNU General Public License v3.0
0 stars 2 forks source link

Program Maps onto the Database | Implementation & Discussion #10

Open jasonmzx opened 1 year ago

jasonmzx commented 1 year ago

Objectives

  1. Get the Program Maps up onto the database by creating the required tables to hold this data. (View Database Implementation notes below)

  2. Write pipeline that writes the JSON of Program Maps, and Abstracted Program map JSON files onto the DB

Getting to know the data... SchedulePlatform/tree/main/backend/app/schedulizer/pmaps in our Archived SchedulePlatform repository, you'll see all the JSON files

Here are the 2 main types of JSON files we'll need to account for:

This is A Regular Program Map

{
    "is_abstracted": false,
    "school": "OTU",
    "name": "Electrical Engineering Y3S1",
    "manifest_list": [
        "ELEE3110U",
        "ELEE3230U",
        "ELEE3240U",
        "ELEE3250U",
        "ELEE3450U"
    ],
    "year": 3,
    "semester": 1,
    "semester_keyword": "Fall",
    "semester_count": 5,
    "total_years": 4,
    "total_semesters_count": 8,
    "category": "Engineering",
    "description": "Electrical Engineering - third year, first semester."
}

This is an Abstracted Program Map

{
    "is_abstracted": true,
    "school": "OTU",
    "name": "ALL Third Year Engineering Y3S1-2",
    "manifest_list": [
        "Automotive Engineering Y3S1",
        "Automotive Engineering Y3S2",
        "Electrical Engineering Y3S1",
        "Electrical Engineering Y3S2",
        "Electrical Engineering: Smart Grid Specialization Y3S1",
        "Electrical Engineering: Smart Grid Specialization Y3S2",
        "Manufacturing Engineering Y3S1",
        "Manufacturing Engineering Y3S2",
        "Mechanical Engineering Energy Option Y3S1",
        "Mechanical Engineering Energy Option Y3S2",
        "Mechanical Engineering Y3S1",
        "Mechanical Engineering Y3S2",
        "Mechatronics Engineering Y3S1 (2021 Entry and later)",
        "Mechatronics Engineering Y3S2 (2021 Entry and later)",
        "Nuclear Engineering Y3S1",
        "Nuclear Engineering Y3S2",
        "Software Engineering IOT Specialization Y3S1",
        "Software Engineering IOT Specialization Y3S2",
        "Software Engineering Y3S1",
        "Software Engineering Y3S2"
    ],
    "semester_keyword": "Fall, Winter",
    "year": 3,
    "category": "(ALL CUMULATIVE) Engineering",
    "description": "(ALL CUMULATIVE) Engineering Programs - third year, first semester, second semester."
}

Database Implementation:

image

Tables added:

tbl_pmaps_abs_compositon linking Program Maps with Abstracted Program Maps tbl_abstract_program_map with Foreign Key Relationship with tbl_pmaps_abs_compositon tbl_program_map with Foreign Key Relationships with tbl_school.school_id , tbl_pmaps_abs_compositon & tbl_pmap_course tbl_pmap_course with Foreign Key Relationships with tbl_course and tbl_program_map

This was quick mockup based on the JSON files we've got in the old repository, so it's still up for discussing before we being our implementation.

General Notes for Pipeline Implementation

We'll need to load the Regular Program maps first, the ones with "is_abstracted" : false, then once those are loaded, we can populate the second table for abstracted program maps...

image

STEP 1 parse all JSON files without prefix ALL_ , (This function should include checking the manifest_list against all tbl_course.course_code entries, so that we can accurately map that relationship upon record creation) *Population of tbl_program_map , tbl_pmap_course )

STEP 2 parse the ones with the ALL_ prefix, and populate the tbl_abstract_program_map , and match the relations to the tbl_pmap_course table.


Scope of Issue

Minnowo commented 1 year ago

I don't really see much difference between the abstracted and normal program maps, could we not just have a table tbl_pmap which has a boolean column for is_abstract? Leaving anything not shared nullable.

Minnowo commented 1 year ago

I don't really understand what tbl_pmaps_abs_compositon does, or what tbl_abstract_program_map is doing.

I was thinking Something more like this: image

Minnowo commented 1 year ago

I don't really understand what tbl_pmaps_abs_compositon does, or what tbl_abstract_program_map is doing.

I was thinking Something more like this: image

I forgot to add a column in tbl_pmap: is_abstracted with type bool.

jasonmzx commented 1 year ago

The reasoning behind Abstract Program Maps tbl_abstract_program_map is that this table holds Program Maps, which are composed of Program Maps (The not abstracted ones) , this is why we need 1 to Many for Abstracted program map to Program Maps (many), and the Programs Map -> Courses is 1 to many aswell.

I have concerns with linking tbl_term to the program maps since we want to avoid repetition at all costs (As I don't want the same Year 3, Fall Semester Electrical Engineering Entry duplicated for every Fall Semester), so I think a general semester ID to ENUM is more suitable for the program maps(semester: 1 is Fall, semester 2: Winter, 3: Summer) for example... so new entry on tbl_program_map which would have a static_semester_int which is 1,2 or 3?

Another issue that arises here is the volatility of tbl_course and tbl_term upon re-scraping, the course ids might be different? so we'll need to preform a re-index of tbl_pmap_course everytime... It might be worth it to include a course_string_identifier so when course ids change, we'll have a way to re-unite the courses to pmaps