NCEAS / metacatui

MetacatUI: A client-side web interface for DataONE data repositories
https://nceas.github.io/metacatui
Apache License 2.0

Portal Metrics are missing data from recent months #1927

Open vchendrix opened 2 years ago

vchendrix commented 2 years ago

Describe the bug In looking at the public portals in ESS-DIVE, I noticed that there were no recent metrics (downloads, views, citations) on the portal summary page. However, these metrics do show up in the search results and on the metadata landing page.

To Reproduce Steps to show example Missing Download and View Metrics:

  1. Go to https://data.ess-dive.lbl.gov/portals/east-river-watershed/Metrics
  2. Scroll down to Downloads and Views metrics
  3. See that there is no data for the last year
  4. Now Go to one of the datasets in the portal : https://data.ess-dive.lbl.gov/view/doi:10.15485/1834207
  5. Click on Downloads
  6. Notice that there were 4 downloads in Nov 2021
  7. Click on Views
  8. Notice that there were views in Oct 2021 - Dec 2021

Steps to show missing citations

  1. Go to https://data.ess-dive.lbl.gov/portals/CDIAC/Metrics
  2. Scroll to the bottom
  3. Notice that there are no citations
  4. Scroll back to top
  5. Click on Data
  6. Notice that there are several datasets with citations

Expected behavior The metrics on the dataset landing page should be showing up on the portal metrics page.

Screenshots

Screen Shot 2021-12-10 at 9 06 19 AM Screen Shot 2021-12-10 at 9 05 58 AM Screen Shot 2021-12-10 at 9 05 40 AM

laurenwalker commented 2 years ago

@rushirajnenuji Do you have an idea on what is causing this issue?

rushirajnenuji commented 2 years ago

Hi @laurenwalker - the first part (usage metrics) of the issue is solved. The citation metrics issue is related to DataONEorg/metrics-service#88 issue and is still a work in progress.

laurenwalker commented 2 years ago

Thanks @rushirajnenuji , I added that ticket as a dependency to this UI ticket.

gothub commented 2 years ago

@rushirajnenuji are the portal usage metrics still an issue?

Looking at ADC and ESS_DIVE metrics today, there are stats for Jan 2022 but none for Feb 2022. What is the best way to verify that these stats are current?

Screen Shot 2022-02-22 at 2 47 56 PM

Screen Shot 2022-02-22 at 2 48 12 PM
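One rough way to approach the verification question above is to list every month in a window and flag the ones with no count. This is a minimal sketch, not the metrics service's own logic: it assumes the monthly counts have already been fetched into a dict keyed by `YYYY-MM`, and the names `month_range`, `missing_months`, and `portal_views`, along with the sample numbers, are hypothetical.

```python
from datetime import date


def month_range(start: date, end: date) -> list[str]:
    """Return every month from start to end (inclusive) as 'YYYY-MM'."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def missing_months(counts: dict[str, int], start: date, end: date) -> list[str]:
    """List months in the window that have no entry or a zero count."""
    return [m for m in month_range(start, end) if counts.get(m, 0) == 0]


# Hypothetical monthly view counts for one portal (illustrative values only).
portal_views = {"2021-11": 12, "2021-12": 7, "2022-01": 3}

# Flags February 2022, the only month in the window without a count.
print(missing_months(portal_views, date(2021, 11, 1), date(2022, 2, 1)))
```

A flagged month only says that no count was reported. It cannot tell a lagging index apart from a month with genuinely no activity, so it is a starting point for checking individual dataset pages rather than a diagnosis.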

gothub commented 2 years ago

Hey @rushirajnenuji these other portals don't report data for Jan, Feb 2022. Could there be a problem with these as well, or has there just been no activity?

No data for Jan, Feb 2022:

https://data.ess-dive.lbl.gov/portals/CDIAC/Metrics
https://data.ess-dive.lbl.gov/portals/WHONDRS/Metrics
https://data.ess-dive.lbl.gov/portals/east-river-watershed/Metrics
https://arcticdata.io/catalog/portals/DBO/Metrics
https://arcticdata.io/catalog/portals/CALM/Metrics

Thx.

mbjones commented 2 years ago

The SASAP portal on DataONE also looks suspect, with potentially missing metrics for Dec 2021 - Feb 2022 and for a good chunk of last year.

See https://search.dataone.org/portals/SASAP/Metrics

rushirajnenuji commented 2 years ago

Hi Peter, Matt - there is a problem with the portal indexing. The Elasticsearch index was lagging behind the events. I'm working on getting this fixed.

re: SASAP portal missing events from early 2021 - I'm not sure why this happened. I'll have to look more and might have to reindex some of the data.

mbjones commented 2 years ago

@rushirajnenuji Has this metrics indexing issue been resolved, and can it be closed out now?

vchendrix commented 2 years ago

Hello there. We should follow up with @mburrus. She has noticed that portal metrics are still an issue.

robyngit commented 1 year ago

@vchendrix @mburrus, are portal metrics still missing from recent months, or has this been resolved?

mbjones commented 1 year ago

There are counts as of May 2023, looks pretty good:

https://data.ess-dive.lbl.gov/profile https://arcticdata.io/catalog/profile

mburrus commented 1 year ago

@robyngit it looks like a few of our data portals are still missing view/download counts from recent months. I haven't checked in on this issue in a while so the following observations are just from my experience today.

Note that ESS-DIVE is running MetacatUI v2.22, so the bugs that were resolved in issues #2041 and #2088 and merged into v2.23 still affect some of our portals (I listed those at the very bottom for reference). @vchendrix can confirm.

Browser: Firefox 113.0.1
OS: macOS Ventura 13.3.1

Missing counts from recent months

I only checked public ESS-DIVE data portals. The portals listed here are all missing metrics from April-May, while some are also missing counts from February and March. Only one portal is falsely reporting 0 counts for view/download. I checked one dataset landing page per portal and confirmed there are view/download counts for the missing months.
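The manual check described above (comparing a portal's monthly counts with the counts on its dataset landing pages) could be sketched roughly as follows. It assumes a portal's monthly total should be at least the sum of the counts for the datasets it contains, which may not hold if the metrics service aggregates differently. The function name `underreported_months` and all sample data are hypothetical.

```python
def underreported_months(
    portal: dict[str, int], datasets: list[dict[str, int]]
) -> dict[str, int]:
    """Map each month to the dataset-level total when the portal reports less."""
    totals: dict[str, int] = {}
    for counts in datasets:
        for month, n in counts.items():
            totals[month] = totals.get(month, 0) + n
    # Keep only months where the portal count falls short of the dataset sum.
    return {m: t for m, t in sorted(totals.items()) if portal.get(m, 0) < t}


# Hypothetical portal-level and dataset-level view counts (illustrative only).
portal_counts = {"2023-03": 40, "2023-04": 0}
dataset_counts = [
    {"2023-03": 25, "2023-04": 4, "2023-05": 2},
    {"2023-03": 15, "2023-04": 6},
]

# March matches (40 == 40); April and May fall short of the dataset totals.
print(underreported_months(portal_counts, dataset_counts))
```

A month missing from the portal dict is treated the same as a reported 0, so this does not distinguish "no data point" from "falsely reporting 0"; that would need a separate check on which keys are present.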

Browser affected metric behavior:

Screenshots

After a few dozen attempts at opening and closing portals, I finally hit an error message (on Firefox). Note that I was not on a metrics page but was attempting to load a freeform page when this happened: https://data.ess-dive.lbl.gov/portals/watershed-function-sfa. If it's unrelated, I'll remove the screenshots.

Screenshot 2023-05-18 at 11 40 54 AM Screenshot 2023-05-18 at 11 42 30 AM


Finally, just for reference, these portals are affected by the bugs resolved in issues #2041 and #2088, which should be fixed when we upgrade to v2.23:

https://data.ess-dive.lbl.gov/portals/watershed-function-sfa/Metrics
https://data.ess-dive.lbl.gov/portals/reporting-formats/Metrics
https://data.ess-dive.lbl.gov/portals/CDIAC/Metrics
https://data.ess-dive.lbl.gov/portals/WHONDRS/Metrics
https://data.ess-dive.lbl.gov/portals/EXCHANGE/Metrics

This data portal has no metric issues! https://data.ess-dive.lbl.gov/portals/NGEE-Arctic/Metrics

mbjones commented 1 year ago

The metrics take a while to load, but even after waiting for them to load, the first link you provided shows that metrics are missing for March, April, and May. See: https://data.ess-dive.lbl.gov/portals/east-river-watershed/Metrics

@rushirajnenuji can you look into this and the other errors reported by @mburrus please?

mbjones commented 1 month ago

@rushirajnenuji @robyngit one+ year later and we're still seeing this metrics issue. April-June missing for ADC, and June missing for ESS-DIVE. Probably others as well. What's our path forward on solving this?

mburrus commented 1 week ago

@mbjones @rushirajnenuji @robyngit A user reported today that their project data portal citation metrics are also not updating with the latest manual citations registered 🙃 This is true for ESS-DIVE's summary metrics as well. Being able to count all project citations seems important to their project management at this time and there actually isn't any way to accurately find out how many citations are registered within a data portal since the preview icons also are not accurate: #2225 .

I can only say that some citations manually registered on June 24th are not showing up in metric summaries. I believe this group has been registering a lot of citations at once for the past month, maybe two, so it's possible that we're only missing citations since the last time the aggregation log was turned on (6/18 according to our Slack conversation). Here is a quote from the user as of today:

"I believe that the citations that are showing up in the portal might have been inputted by ESS-DIVE members for one of the use cases, but am not positive (it could have been Amy). None of the citations I inputted are showing up in the portal. "


Details

Data Portal Metrics: The PNNL River Corridor SFA portal metrics page says there are 13 citations in the portal. However, when I visited the portal's data page, I counted 32 citations in the view/download/citation count preview icons. Because of the ongoing issue with inconsistent reporting between these icons and the landing page counts (issue referenced above), there are definitely more than 32 citations for this project.

ESS-DIVE Summary Metrics: One of their datasets currently has 1 citation that was manually registered (listed below). If you search for this citation on ESS-DIVE's summary page, it cannot be found. Additionally, I searched for "2024" and no dataset or paper citations from 2024 are in the summary table.

Allison N Myers-Pigg, Samantha Grieger, J Alan Roebuck, Morgan E Barnes, Kevin D Bladon, John D Bailey, Riley Barton, Rosalie K Chu, Emily B Graham, Khadijah K Homolka, William Kew, Andrew S Lipton, Timothy Scheibe, Jason G Toyoda, & Sasha Wagner. (2024). Experimental Open Air Burning of Vegetation Enhances Organic Matter Chemical Heterogeneity Compared to Laboratory Burns. Environmental Science & Technology. Vol. 58. pp. 9679-9688. doi:10.1021/acs.est.3c10826 (https://doi.org/10.1021/acs.est.3c10826)

Comparing this with my records

These are all the records I have of citations registered with ESS-DIVE. I recall seeing the same count for a long time and not bothering to record the number since it seemed like people weren't using the registration feature.

May 2022: 235
November 2022: 235
July 2024: 270
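For a quick read of the recorded totals, the change between consecutive snapshots can be computed as below. The three counts are the ones recorded above; the names `snapshots` and `deltas` are hypothetical, and this is only an arithmetic sketch, not something the metrics service provides.

```python
# Citation totals as recorded in this comment: (label, registered citations).
snapshots = [("May 2022", 235), ("November 2022", 235), ("July 2024", 270)]


def deltas(records: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Return (label, change since the previous snapshot) for each later record."""
    return [
        (later[0], later[1] - earlier[1])
        for earlier, later in zip(records, records[1:])
    ]


# No change between May and November 2022, then 35 more by July 2024.
print(deltas(snapshots))
```

If the summary metrics were current, something on the order of those 35 newer registrations would be expected to appear in the summary table, which is why the absence of any 2024 entries stands out.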