hypothesis / report

The internal global reporting product for Hypothesis
BSD 2-Clause "Simplified" License
0 stars 0 forks source link

Pie chart summaries of assignment types for an organization #230

Closed jon-betts closed 1 year ago

jon-betts commented 1 year ago

For:

Why are we doing it?

I think this is about quickly orienting staff about the size an nature of a customer when looking at their stats.

What are we doing

A pie chart of the different document types in use in an assignment for an organization somewhere near the top.

The pie chart would be determined by

We would not be uniquifying the documents by URL. This would be a count of the types of assignments not documents. So the same document in three different assignments, would count three times.

Tasks

What are file types?

File type Identifier
Blackboard files blackboard://
D2L d2l://
Canvas files canvas://
VitalSource vitalsource://
JSTOR jstor://
Google files https://drive.google.com ?
Microsoft OneDrive https://api.onedrive.com
YouTube https://www.youtube.com/watch
HTML / PDF http: or https:
No document Blank field
Other Whatever is left

I'm not sure that we can tell PDFs from HTML. Any new items we're not looking for are likely to be marked as "Other" anyway, so we could just roll them all up into that. It's possible some of the items that end up in the "HTML / PDF" category are actually something else we don't know to look for. For example a strange format of Google Drive link.

We can't currently do "YouTube" as described above, because it's not been released. And when it is released, we probably won't store the URL like that.

Effort estimate

jon-betts commented 1 year ago

FYI for myself about calculating this

WITH 
    assignment_documents AS (
        SELECT 
            id as assignment_id,
            TRIM(REPLACE(LOWER(document_url), '%3a%2f%2f', '://')) AS url
        FROM assignment
    ),
    documents AS (
        SELECT 
            url,
            CASE 
                WHEN url IS NULL OR url = '' THEN 'No Document'
                WHEN STARTS_WITH(url, 'blackboard://') THEN 'Blackboard Files'
                WHEN STARTS_WITH(url, 'd2l://') THEN 'D2L Files'
                WHEN STARTS_WITH(url, 'canvas://') THEN 'Canvas Files'
                WHEN STARTS_WITH(url, 'vitalsource://') THEN 'VitalSource'
                WHEN STARTS_WITH(url, 'jstor://') THEN 'JSTOR'
                WHEN STARTS_WITH(url, 'https://drive.google.com') THEN 'Google Drive'
                WHEN STARTS_WITH(url, 'https://api.onedrive.com') THEN 'Microsoft OneDrive'
                WHEN STARTS_WITH(url, 'https://www.youtube.com/watch') THEN 'YouTube'
                WHEN STARTS_WITH(url, 'http:/') THEN 'PDF / HTML'
                WHEN STARTS_WITH(url, 'https:/') THEN 'PDF / HTML'
                ELSE 'Other'
            END AS file_type 
        FROM assignment_documents
    )

SELECT 
    file_type,
    COUNT(1)
FROM documents
group by file_type
ORDER BY COUNT(1) DESC