Open zqian opened 4 years ago
A problem with this approach is that not all Caliper events contain the same data.
For example, MiVideo events sent by Kaltura do not include users' Canvas user ID number. Users' ID numbers can be found in other events using these query fragments:
CAST(SUBSTR(JSON_EXTRACT_SCALAR(event, '$.membership.member.id'), 29) AS INT64) AS user_id
@canvas_data_id_increment + CAST(JSON_EXTRACT_SCALAR(event, '$.federatedSession.messageParameters.custom_canvas_user_id') AS INT64) AS user_id
But that's not possible with MiVideo events. The closest data those events have to identifying the user is the login name, which is part of the actor ID (currently, the MiVideo events are not slim), retrieved this with segment:
replace(JSON_EXTRACT_SCALAR(event, '$.actor.id'), 'https://aakaf.mivideo.it.umich.edu/caliper/info/user/', '') AS user_login_name
New code is added (in #826) to the cron process to convert user_login_name
to user_id
using data from MyLA's user
table.
855 Expected behavior (A clear and concise description of what you expected to happen) :
Describe the bug (Tell us what happens instead of the expected behavior) :
The long query string for BigQuery could be improved
Matt suggested following in issue #848: "I think these queries can be simplified in other ways though by pulling the patterns out of them but that's a fix I haven't gotten to yet.
For example:
SELECT 'canvas' AS resource_type, CAST(SUBSTR(JSON_EXTRACT_SCALAR(event, '$.object.id'), 35) AS STRING) AS resource_id, CAST(SUBSTR(JSON_EXTRACT_SCALAR(event, '$.membership.member.id'), 29) AS INT64) AS user_id, CAST(SUBSTR(JSON_EXTRACT_SCALAR(event, '$.group.id'),31) AS INT64) AS course_id, COALESCE(JSON_EXTRACT_SCALAR(event, '$.object.extensions[\'com.instructure.canvas\'].asset_name'), JSON_EXTRACT_SCALAR(event, '$.object.extensions[\'com.instructure.canvas\'].filename'), JSON_EXTRACT_SCALAR(event, '$.object.name'), 'attachment') as name, datetime(EVENT_TIME) as access_time` Could probably be something like
resource_type: "canvas", resource_id: "SUBSTR(JSON_EXTRACT_SCALAR(event, '$.object.id'), 35)", user_id: "SUBSTR(JSON_EXTRACT_SCALAR(event, '$.membership.member.id'), 29)", course_id: "SUBSTR(JSON_EXTRACT_SCALAR(event, '$.group.id'),31)", This could probably be improved further since many of these are pulling substrings out of the json event so that could be special too.
We could probably improve on all other parts of this query too by finding and replacing the patterns, or using some "methods" to do that. I really don't like how this is either but this is the real query. ;)"