Closed julienye closed 1 year ago
@julienye Are you able to directly run queries against the databases?
Yes, using DBeaver. The "division by zero" error didn't specify where the error occurred, but when I removed the ratio-calculation code from the script, it generated the counts without error.
On Mon, Jan 24, 2022 at 3:40 PM scwiek @.***> wrote:
@julienye https://github.com/julienye Are you able to directly run queries against the databases?
— Reply to this email directly, view it on GitHub https://github.com/openlibraryenvironment/reshare-analytics/issues/38#issuecomment-1020527659, or unsubscribe https://github.com/notifications/unsubscribe-auth/AVUFOPD6UFKKL72U56RHAKLUXW2MRANCNFSM5MWMAW6A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
You are receiving this because you were mentioned.Message ID: @.***>
Can you post an SQL query that shows the error?
Assuming September is one of the bad months, give this a try.
WITH parameters AS ( SELECT '2021-09-01'::date AS start_date, '2021-10-01'::date AS end_date ) SELECT * FROM ( SELECT rs_requester AS requester, sum( CASE WHEN rs_to_status = 'REQ_VALIDATED' THEN 1 ELSE 0 END) AS reqs, sum( CASE WHEN rs_to_status = 'REQ_CANCELLED' THEN 1 ELSE 0 END) AS cancelled, sum( CASE WHEN rs_to_status = 'REQ_END_OF_ROTA' THEN 1 ELSE 0 END) AS unfilled, sum( CASE WHEN (rs_from_status = 'REQ_SHIPPED' AND rs_to_status = 'REQ_CHECKED_IN') OR rs_to_status = 'REQ_FILLED_LOCALLY' THEN 1 ELSE 0 END) AS received, round(coalesce((sum( CASE WHEN (rs_from_status = 'REQ_SHIPPED' AND rs_to_status = 'REQ_CHECKED_IN') THEN 1 ELSE 0 END) / nullif (cast(sum( CASE WHEN rs_to_status = 'REQ_VALIDATED' THEN 1 ELSE 0 END) AS decimal), 0)), 0), 2) AS filled_ratio FROM reshare_derived.req_stats WHERE rs_date_created >= ( SELECT start_date FROM parameters) AND rs_date_created < ( SELECT end_date FROM parameters) GROUP BY requester UNION SELECT 'consortium' AS requester, sum( CASE WHEN rs_to_status = 'REQ_VALIDATED' THEN 1 ELSE 0 END) AS reqs, sum( CASE WHEN rs_to_status = 'REQ_CANCELLED' THEN 1 ELSE 0 END) AS cancelled, sum( CASE WHEN rs_to_status = 'REQ_END_OF_ROTA' THEN 1 ELSE 0 END) AS unfilled, sum( CASE WHEN (rs_from_status = 'REQ_SHIPPED' AND rs_to_status = 'REQ_CHECKED_IN') OR rs_to_status = 'REQ_FILLED_LOCALLY' THEN 1 ELSE 0 END) AS received, round(coalesce((sum( CASE WHEN (rs_from_status = 'REQ_SHIPPED' AND rs_to_status = 'REQ_CHECKED_IN') THEN 1 ELSE 0 END) / nullif (cast(sum( CASE WHEN rs_to_status = 'REQ_VALIDATED' THEN 1 ELSE 0 END) AS decimal), 0)), 0), 2) AS filled_ratio FROM reshare_derived.req_stats WHERE rs_date_created >= ( SELECT start_date FROM parameters) AND rs_date_created < ( SELECT end_date FROM parameters) GROUP BY requester) AS core_req ORDER BY ( CASE WHEN core_req.requester = 'consortium' THEN 0 ELSE 1 END), core_req.requester;
As it happens, September is not one of the bad months, but October is, so substitute this for the beginning of the query scwiek just posted:
WITH parameters AS ( SELECT '2021-10-01'::date AS start_date, '2021-10-31'::date AS end_date ) SELECT * ...
Member Gettysburg has 0 requests in October...
On Mon, Jan 24, 2022 at 3:59 PM scwiek @.***> wrote:
Assuming September is one of the bad months, give this a try.
WITH parameters AS ( SELECT '2021-09-01'::date AS start_date, '2021-10-01'::date AS end_date ) SELECT * FROM ( SELECT rs_requester AS requester, sum( CASE WHEN rs_to_status = 'REQ_VALIDATED' THEN 1 ELSE 0 END) AS reqs, sum( CASE WHEN rs_to_status = 'REQ_CANCELLED' THEN 1 ELSE 0 END) AS cancelled, sum( CASE WHEN rs_to_status = 'REQ_END_OF_ROTA' THEN 1 ELSE 0 END) AS unfilled, sum( CASE WHEN (rs_from_status = 'REQ_SHIPPED' AND rs_to_status = 'REQ_CHECKED_IN') OR rs_to_status = 'REQ_FILLED_LOCALLY' THEN 1 ELSE 0 END) AS received, round(coalesce((sum( CASE WHEN (rs_from_status = 'REQ_SHIPPED' AND rs_to_status = 'REQ_CHECKED_IN') THEN 1 ELSE 0 END) / nullif (cast(sum( CASE WHEN rs_to_status = 'REQ_VALIDATED' THEN 1 ELSE 0 END) AS decimal), 0)), 0), 2) AS filled_ratio FROM reshare_derived.req_stats WHERE rs_date_created >= ( SELECT start_date FROM parameters) AND rs_date_created < ( SELECT end_date FROM parameters) GROUP BY requester UNION SELECT 'consortium' AS requester, sum( CASE WHEN rs_to_status = 'REQ_VALIDATED' THEN 1 ELSE 0 END) AS reqs, sum( CASE WHEN rs_to_status = 'REQ_CANCELLED' THEN 1 ELSE 0 END) AS cancelled, sum( CASE WHEN rs_to_status = 'REQ_END_OF_ROTA' THEN 1 ELSE 0 END) AS unfilled, sum( CASE WHEN (rs_from_status = 'REQ_SHIPPED' AND rs_to_status = 'REQ_CHECKED_IN') OR rs_to_status = 'REQ_FILLED_LOCALLY' THEN 1 ELSE 0 END) AS received, round(coalesce((sum( CASE WHEN (rs_from_status = 'REQ_SHIPPED' AND rs_to_status = 'REQ_CHECKED_IN') THEN 1 ELSE 0 END) / nullif (cast(sum( CASE WHEN rs_to_status = 'REQ_VALIDATED' THEN 1 ELSE 0 END) AS decimal), 0)), 0), 2) AS filled_ratio FROM reshare_derived.req_stats WHERE rs_date_created >= ( SELECT start_date FROM parameters) AND rs_date_created < ( SELECT end_date FROM parameters) GROUP BY requester) AS core_req ORDER BY ( CASE WHEN core_req.requester = 'consortium' THEN 0 ELSE 1 END), core_req.requester;
— Reply to this email directly, view it on GitHub https://github.com/openlibraryenvironment/reshare-analytics/issues/38#issuecomment-1020540582, or unsubscribe https://github.com/notifications/unsubscribe-auth/AVUFOPHMEHUL5W4VUIYBQ5TUXW4SPANCNFSM5MWMAW6A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
You are receiving this because you were mentioned.Message ID: @.***>
@julienye Did the query I posted (with your date adjustments) still produce the "division by zero" error?
oh, sorry! I didn't realize you wanted me to try it! Your revised query does not produce the "division by zero" error. However, it did omit Gettysburg (which had all zero counts) from the results. I think we'd prefer to see them included, with zeroes, for ease of comparison with other months - if that's feasible.
I also ran it against November data, when Widener had 0 requests but 1 received.
No error, and the results were what I'd expect to see:
requester | reqs | cancelled | unfilled | received | filled_ratio widener | 0 | 0 | 0 | 1 | 0
Thanks!
Thanks @julienye, that's helpful information.
@scwiek I had the opportunity to review reports with @debradenault yesterday, and she confirmed what I alluded to in an earlier comment. It appears that when an institution has 0 requests in whatever date range the report is run, that institution is omitted from the results. Debra stated that all institutions should appear in the results list, e.g., to facilitate comparison with other months, even if they have no data for the selected time period.
Can you update the core requesting and core supplying scripts so that 0-request rows are not dropped, instead supplying 0 for all columns and calculations?
As a reminder, October 2021 is a good month for testing core_requesting @ PALCI: Gettysburg had 0 requests. And November 2021 is good for testing core_supplying @ PALCI: Misericordia had 0 supply requests.
Thanks!
@julienye I might have a fix to add the empty institutions to the core reports. Once #42 is merged in by Nassib and pushed into the live systems, can you run this query against the CNY and/or PALCI systems to see if it's displaying as desired?
Sure - happy to test whenever!
On Mon, Feb 21, 2022 at 4:06 PM scwiek @.***> wrote:
@julienye https://github.com/julienye I might have a fix to add the empty institutions to the core reports. Once #42 https://github.com/openlibraryenvironment/reshare-analytics/pull/42 is merged in by Nassib and pushed into the live systems, can you run this query https://github.com/scwiek/reshare-analytics/blob/empty-test/sql/report_queries/core_requesting/core_req.sql against the CNY and/or PALCI systems to see if it's displaying as desired?
— Reply to this email directly, view it on GitHub https://github.com/openlibraryenvironment/reshare-analytics/issues/38#issuecomment-1047228186, or unsubscribe https://github.com/notifications/unsubscribe-auth/AVUFOPEXMTTQYQZ7LQZO7RDU4KSODANCNFSM5MWMAW6A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
You are receiving this because you were mentioned.Message ID: @.***>
I generated monthly reports for PALCI and CNY for individual months in 2021. Most queries failed to execute with a "division by zero" error, which I was able to work around by skipping the ratio calculations in the script, and doing those calculations later, offline.
I believe the error occurs if a library has 0 requests or 0 supply requests during the specified time period. All report queries must be able to handle this case. Suggestion: either return "N/A" or 0 for the calculated ratio, when the denominator is 0.