tl-its-umich-edu / my-learning-analytics

My Learning Analytics (MyLA)
Apache License 2.0
36 stars 39 forks source link

remove reference to 'No Term' term from UDP course query #1499

Closed cosimps closed 1 year ago

cosimps commented 1 year ago

When working with a UDP only implementation of MyLA it will not import any course data for courses that it is installed for. The cron outputs the following:

[2023-03-13 11:00:59 -0400] [INFO] [cron.py:30] /code/config/cron_udp.hjson
[2023-03-13 11:00:59 -0400] [INFO] [cron.py:585] ** MyLA cron tab
[2023-03-13 11:00:59 -0400] [INFO] [cron.py:136] Course 42420000000111111 doesn't have an entry in data warehouse yet. It hasn't been updated locally, so skipping.
[2023-03-13 11:00:59 -0400] [INFO] [cron.py:136] Course 42420000000222222 doesn't have an entry in data warehouse yet. It hasn't been updated locally, so skipping.
[2023-03-13 11:00:59 -0400] [INFO] [cron.py:136] Course 42420000000333333 doesn't have an entry in data warehouse yet. It hasn't been updated locally, so skipping.
[2023-03-13 11:00:59 -0400] [INFO] [cron.py:136] Course 42420000000444444 doesn't have an entry in data warehouse yet. It hasn't been updated locally, so skipping.

Despite the courses existing in the database:

        id         | canvas_id | enrollment_term_id |   name   |      start_at       |     conclude_at     
-------------------+-----------+--------------------+----------+---------------------+---------------------
 42420000000222222 |    222222 |  42420000000000222 | SS93K    | 2023-03-31 00:00:00 | 2023-05-19 00:00:00
 42420000000333333 |    333333 |  42420000000000222 | SG31     | 2023-01-24 00:00:00 | 2023-05-19 00:00:00
 42420000000444444 |    444444 |  42420000000000222 | SMM72A   | 2023-01-24 00:00:00 | 2023-04-23 00:00:00
 42420000000111111 |    111111 |  42420000000000001 | LAFS     | 2020-02-24 00:00:00 | 

This is due to the reference of the term named No Term in the courses query of the cron_udp.hjson config. By removing that we are able to successfully import courses attached to any term for MyLA.

This PR removes the reference for the No Term term that enabled us to successfully run the MyLA imports.

jonespm commented 1 year ago

I'll look at this. I know we specifically added this "No Term" clause because in the initial version of this the courses that were in the the "No Term" category weren't coming up.

This was because academic_term_id was null in the UDP. Is academic_term_id null in these cases you have too? I'm not sure how this query could work but I can try and see if it produces similar results.

cosimps commented 1 year ago

Not in any of the courses for this install, they all have an id in the academic_term_id column.

zqian commented 1 year ago

The 'no term' is not included inside academic_session table (since it is based on PeopleSoft/SIS data, not on Canvas), but the 'no term' term is indeed included inside academic_term table.

It turns out the joining with academic_session table is not necessary. The query can be written as the following:

    SELECT
        cast(co2.lms_int_id as BIGINT) as id,
        cast(co2.lms_ext_id as BIGINT) as canvas_id,
        cast(at2.lms_int_id as BIGINT) as enrollment_term_id,
        co.le_code as name,
        co.le_start_date::timestamp without time zone as start_at,
        co.le_end_date::timestamp without time zone as conclude_at
    FROM
        entity.course_offering co
        LEFT OUTER JOIN entity.academic_term at1 on (co.academic_term_id = at1.academic_term_id),
        keymap.course_offering co2,
        keymap.academic_term at2
        WHERE co2.lms_int_id in %(course_ids)s
        and co.course_offering_id = co2.id
        and at1.academic_term_id = at2.id

@cosimps and @jonespm Could you please review?

cosimps commented 1 year ago

@zqian that query does appear to return the same datasets as the original change

jonespm commented 1 year ago

Hmm, I pulled the list of all the course ids to use from our production to substitute in %(course_ids)s there are 537 courses defined there.

SET SESSION group_concat_max_len = 16384;
select CONCAT("'", GROUP_CONCAT(id order by id SEPARATOR "','"), "'") from course where id is not null order by id;

The query on the PR 'as-is' only seems to return 51 rows, the query Zhen proposed returns 537 rows, our original query returns 537 rows.

So I'd probably close this one and @zqian can open one with her change.

cosimps commented 1 year ago

The query on the PR 'as-is' only seems to return 51 rows, the query Zhen proposed returns 537 rows, our original query returns 537 rows.

So I'd probably close this one and Zhen can open one with her change.

@jonespm I tested both queries against the 6 instances of myla we host and both return the same set of data, so that option is fine with me

zqian commented 1 year ago

As suggested, I will close this PR. Please track the progress in the new PR: #1505