CDLUC3 / ezid

Refactor "Download report in CSV format" #718

Closed jsjiang closed 1 month ago

jsjiang commented 2 months ago

After logging in, an EZID user can download a statistics report using the "Download report in CSV format" function on the Dashboard page. There are a few issues that need fixing:

  1. Numbers in the UI do not match the numbers in the downloaded CSV file
  2. Accessing the "Download report in CSV" always triggers an Internal Server Error. Add error handling to avoid sending false error notifications.
Subject: [Django] ERROR (EXTERNAL IP): Internal Server Error: /dashboard/csv_stats
Internal Server Error: /dashboard/csv_stats

AttributeError at /dashboard/csv_stats
'NoneType' object has no attribute 'isSuperuser'

Request Method: HEAD
Request URL: http://ezid.cdlib.org/dashboard/csv_stats
Django Version: 4.2.15
Python Executable: /ezid/.pyenv/versions/ezid-py3/bin/python
Python Version: 3.8.5
sfisher commented 2 months ago

I've corrected the internal server errors, which do not happen every time but are caused by a user not being logged in. The fix is to use the decorator to require a logged-in user for this action. I believe this was triggered by having several windows or tabs open: opening the reports page, logging out in another window, and then trying to retrieve a report while logged out.
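To illustrate the kind of guard described here, below is a minimal, framework-free sketch. It is not the actual EZID code: `get_user`, `require_login`, `LOGIN_URL`, and the stand-in request/response objects are hypothetical names made up for the example. Only the `isSuperuser` attribute and the `csv_stats` view name come from the error report above.

```python
from functools import wraps
from types import SimpleNamespace

LOGIN_URL = "/login"  # hypothetical login path, for illustration only


def get_user(request):
    """Stand-in for a session lookup; returns None when nobody is logged in."""
    return getattr(request, "user", None)


def require_login(view):
    """Send anonymous requests to the login page instead of calling the view."""
    @wraps(view)
    def wrapper(request, *args, **kwargs):
        if get_user(request) is None:
            # Without this check the view would dereference None and raise
            # AttributeError, which surfaces as an Internal Server Error.
            return SimpleNamespace(status=302, location=LOGIN_URL)
        return view(request, *args, **kwargs)
    return wrapper


@require_login
def csv_stats(request):
    user = get_user(request)
    scope = "all" if user.isSuperuser else "own"
    return SimpleNamespace(status=200, body=f"scope={scope}")


# Stand-in requests: one logged out, one logged in as a superuser.
anonymous = SimpleNamespace(user=None)
admin = SimpleNamespace(user=SimpleNamespace(isSuperuser=True))
```

With a guard like this, a request from a logged-out tab gets a redirect rather than reaching the line that reads `isSuperuser`.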

sfisher commented 2 months ago

I ran python manage.py proc-stats on dev and let it run overnight, and the stats seem consistent to me, though this entire page is a bit confusing since the UI isn't consistent and has a few problems.

  1. The stats produced by "Download report in CSV format" include all stats for the users/groups/realms that the user is allowed to see and are not limited in the same way as the interface. Also note that the page says anonymous/test IDs are excluded from this report (and it probably makes sense not to track test identifiers):
    Test identifiers (and therefore anonymously-owned identifiers) are not included. Reserved identifiers are included, however.
  2. Changing the "view report for" dropdown to a different group or realm resets the dates in the on-page report to just the last year for which stats are available. Similarly, changing the dates for the on-page report completely resets any group or realm selected in the drop-down list, so you can't filter by both values at once. This seems to be a UI problem that I hope is easy to correct. Not resetting the group/realm when the dates change is probably the most important fix. Resetting the dates when the user/realm/group filter changes makes some sense, because the available dates differ and not all date ranges are available for all filtered data.
  3. I do not see discrepancies between the report values and the CSV values when I filter the CSV for the same data in something like Excel. It just seems that the CSV report gives a user everything they could want and doesn't follow the rest of the UI filters on the page. It also seems that the Totals for IDs are for all time, not for the dates selected in the lower section of the page.

Examples of what I see on the page versus what I see in Excel:

Page totals by month: [image]

vs. CSV (filtered in Excel to the same group): [image]

Totals in the page: [image]

vs. totals (sums) in Excel: [image]

So I'm not seeing anything inconsistent in the stats when I test on dev after updating them. Examples of where the stats are inconsistent would be helpful, since I can't reproduce the problem to troubleshoot it. Also, we may need to ensure that proc-stats is running nightly on all environments.
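The Excel check described above (filter the CSV to one group, then sum by month) could also be scripted. This is only a sketch: the column names `month`, `group`, and `count` and the sample rows are made up for illustration and are not the real layout or data of the EZID export.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; the real export's columns and values will differ.
SAMPLE_CSV = """month,group,count
2024-01,groupA,10
2024-01,groupB,4
2024-02,groupA,7
"""


def monthly_totals(csv_text, group):
    """Sum the count column per month for a single group."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["group"] == group:
            totals[row["month"]] += int(row["count"])
    return dict(totals)
```

Comparing `monthly_totals(...)` for a group against the monthly numbers shown on the page for the same group would make any real discrepancy easy to pin down.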

I can probably fairly easily modify the UI code so it doesn't reset the filter in the upper part of the page when picking new dates for reporting in the lower part of the page.
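One possible way to keep the upper filter when the dates change is to build the new URL from the current query string instead of starting fresh. The sketch below assumes the filters travel as query parameters; the names `owner` and `month_from` are hypothetical, and the real dashboard may pass its filters differently.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url, **changes):
    """Return url with the given query parameters updated, keeping the rest."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    params.update(changes)  # only the changed filter is overwritten
    return urlunsplit(parts._replace(query=urlencode(params)))
```

For example, `with_params("/dashboard?owner=group_x&month_from=2023-01", month_from="2024-01")` changes the date while leaving the `owner` selection in place.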

IDK if we want to modify everything here. It seems the CSV download includes all stats based on user permission. The summary stats at the top include everything for the filter selected at the top, but ignore the date range selected at the bottom.

It is a confusing UI since it almost seems to be about 3 pages that work a bit differently munged into one page, and it's not clear which filters apply to which sections. The sections are: 1) download all my stats as CSV; 2) filter and get totals for a group/realm, with totals for all time; and 3) filter to specific dates (optionally with a group/realm) to see a preview report for just that date range.

adambuttrick commented 1 month ago

Added with v3.2.24.