CDLUC3 / ezid

[RESEARCH] Evaluate Post-Processing Apache Logs for EZID Usage Analytics #670

Closed. adambuttrick closed this issue 2 months ago

adambuttrick commented 3 months ago

Describe the current state/issue

Part of #666. EZID currently uses Matomo for web analytics, which has not performed well under the increased traffic that followed the N2T resolution functionality cut-over.

Describe the desired state/solution

Investigate the viability and potential benefits of implementing a post-processing approach for EZID usage analytics using Apache log files. The investigation should specifically:

  1. Assess the viability of using Matomo (or a similar tool) to parse Apache log files for analytics instead of real-time tracking.
  2. Compare the performance and availability of post-processed log analysis vs. current real-time tracking, especially under high load from resolution requests or similar traffic.
  3. Identify any gaps in functionality or data granularity compared to the current real-time Matomo implementation.
  4. Evaluate the effort required to implement a log parsing solution and integrate it with existing analytics processes and dashboards.
  5. Assess whether post-processing would impede real-time troubleshooting or incident handling.
  6. Develop a proof-of-concept implementation to validate findings from the above points.
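
As a starting point for the proof of concept in item 6, here is a minimal sketch of post-processing access-log lines in Python. It assumes the logs use Apache's standard "combined" LogFormat; the actual EZID log format is not stated in this issue, so the regex may need adjusting. The function names (`parse_line`, `summarize`) and the sample lines are hypothetical and for illustration only.

```python
import re
from collections import Counter

# Assumption: Apache "combined" LogFormat, i.e.
#   %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count requests by status code and by first path segment."""
    by_status = Counter()
    by_section = Counter()
    skipped = 0
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            skipped += 1  # keep a count so parsing gaps are visible
            continue
        by_status[rec["status"]] += 1
        # Drop any query string, then keep only the first path segment.
        path = rec["path"].split("?", 1)[0]
        section = "/" + path.lstrip("/").split("/", 1)[0]
        by_section[section] += 1
    return by_status, by_section, skipped


# Made-up sample lines, not real EZID traffic.
SAMPLE = [
    '192.0.2.10 - - [10/Oct/2024:13:55:36 -0700] '
    '"GET /id/ark:/12345/x1 HTTP/1.1" 200 512 "-" "curl/8.0"',
    '192.0.2.11 - - [10/Oct/2024:13:55:37 -0700] '
    '"GET /search?q=test HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    'not a log line',
]

if __name__ == "__main__":
    statuses, sections, skipped = summarize(SAMPLE)
    print(statuses, sections, skipped)
```

Because this runs over files after the fact, it adds no load to request handling, which is the main trade-off items 2 and 5 ask to weigh against losing real-time visibility.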

adambuttrick commented 2 months ago

Per discussion with Jing, we would prefer to use OpenSearch rather than Matomo to process the log files.
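
For reference, a rough sketch of how parsed log records could be shaped for OpenSearch's bulk API, which accepts newline-delimited JSON with an action line followed by a document line. This is only an illustration under assumptions: the index name `ezid-access-logs`, the document field names, and the input record layout are all hypothetical, and a real pipeline might use a log shipper instead of custom code.

```python
import json
from datetime import datetime


def to_bulk_ndjson(records, index="ezid-access-logs"):
    """Build a bulk-API request body (NDJSON) from parsed log records.

    `records` is an iterable of dicts with the keys used below; both the
    record layout and the index name are assumptions for this sketch.
    """
    lines = []
    for rec in records:
        # Apache's default %t timestamp looks like 10/Oct/2024:13:55:36 -0700
        ts = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
        doc = {
            "@timestamp": ts.isoformat(),
            "client": rec["host"],
            "method": rec["method"],
            "path": rec["path"],
            "status": int(rec["status"]),
            # Apache logs "-" when no body bytes were sent.
            "bytes": int(rec["size"]) if rec["size"].isdigit() else 0,
        }
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up record, not real EZID traffic.
EXAMPLE = [{
    "host": "192.0.2.10",
    "time": "10/Oct/2024:13:55:36 -0700",
    "method": "GET",
    "path": "/id/ark:/12345/x1",
    "status": "200",
    "size": "-",
}]

if __name__ == "__main__":
    print(to_bulk_ndjson(EXAMPLE), end="")
```

The resulting body would be sent as a `POST` to the cluster's `_bulk` endpoint with the `application/x-ndjson` content type.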