Closed bmtcril closed 1 year ago
@SoryRawyer heads up on this one, can you take a look?
I think this might be an artifact of dbt not cleaning up when models are removed. The enrollments_by_day
query was moved into a Superset virtual dataset since it uses window functions and therefore would not utilize indexes when queried if created as a view by dbt. I think the issue of excluding the current day in the window query was addressed in this PR: https://github.com/openedx/tutor-contrib-aspects/pull/260
Are you seeing that the current day is still excluded in the instructor dashboard?
Hmm no, I was trying to answer a question for @Ian2012 about where to query enrollment information (total learners ever enrolled, including inactive and total active enrollments) but must have had old views. So for this information he should look to the virtual dataset for prior art?
I think I'd err on the side of using reporting.fact_enrollments
when possible, but I think the enrollments_by_day
virtual dataset is going to be the best way to answer questions like "how many learners were enrolled on a specific day?". Unfortunately, querying virtual datasets isn't the best experience, since you need to use the dataset macro and you need to pass the integer ID of the dataset, so it'd look something like this:
select count(*)
from {{ dataset(40) }}
where enrollment_status_date = '2023-05-01'
Now that we're using dictionaries to join the entity names, I think the last ClickHouse issue that's keeping us from moving this into dbt is the fact that views containing window queries don't use any indexes. I haven't looked at the upcoming ClickHouse improvements, so I'm not sure if that's on the horizon or if we need to think of more permanent guidance for where to query.
Would it be worth creating something like a "cookbook" for common query patterns?
We can also create a custom macro that returns the ID corresponding for a given UUID. Not sure if that's suitable, but worth the try
I think it's probably best to hold off a little longer until we can make a call on RBAC. If we can get 90% of this in dbt and just need to add in filters it will make things a lot easier.
This has been fixed for a while 👍
I would expect there to be rows returned for the following query, but instead it is 0.
I'm able to get the expected rows by changing
int_enrollment_windows
window_end_date
to:dateTrunc('day', coalesce(window_end_at, today() + toIntervalDay(1))) AS window_end_date,
but I'm not sure what the impacts of that change are. I think this is just a "midnight time boundary" issue?