Chocobozzz / PeerTube

ActivityPub-federated video streaming platform using P2P directly in your web browser
https://joinpeertube.org/
GNU Affero General Public License v3.0

Add a Prometheus Exporter #3742

Closed pichouk closed 3 years ago

pichouk commented 3 years ago

Hi,

We are hosting a PeerTube instance for our association and we quite like the statistics shown on the /about/instance page (this component): number of videos, views, etc.

I think it would be nice to have a Prometheus endpoint under /metrics so those metrics can be scraped with Prometheus. IMHO Prometheus is the leading metrics-scraping solution, and integrating a Prometheus endpoint into PeerTube would be useful for several organizations hosting PeerTube.

First, I am opening this issue to discuss the feature. Do you think it would be a good idea to add a Prometheus endpoint to PeerTube? Second, here is how I imagine this feature:

I am not really a developer, but I can try to implement the feature. I will probably need help to do it correctly, but I can try :)

What do you think ?

Chocobozzz commented 3 years ago

Hello,

I don't want to maintain specific endpoints for every monitoring system. Doesn't Prometheus support JSON input? At Framasoft we just use the REST API with Telegraf to graph instance stats: https://peertube2.cpy.re/api/v1/server/stats

pichouk commented 3 years ago

Sorry, I didn't know that those statistics were easily available under /api/v1/server/stats. I would have preferred a Prometheus format, but I agree that it is not a good idea to maintain integrations for several monitoring systems. This endpoint can be used to convert the metrics to Prometheus format with a little script; I'll do that.
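A minimal sketch of what such a "little script" could look like, using only the Python standard library. It is an illustration, not the script that was actually written: the metric names and the `METRICS` mapping are assumptions, and only three of the stats keys are mapped.

```python
import json
from urllib.request import urlopen

# Assumed mapping from keys of /api/v1/server/stats to metric names.
# The metric names are illustrative, not an established convention.
METRICS = {
    "totalLocalVideos": "peertube_local_videos_count",
    "totalLocalVideoViews": "peertube_local_video_views_count",
    "totalUsers": "peertube_users_count",
}


def fetch_stats(base_url):
    """Download and decode the JSON stats document of a PeerTube instance."""
    with urlopen(base_url + "/api/v1/server/stats", timeout=10) as resp:
        return json.load(resp)


def to_prometheus_text(stats):
    """Render the mapped stats in the Prometheus text exposition format."""
    lines = []
    for key, name in METRICS.items():
        if key not in stats:
            continue  # skip keys this instance does not report
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {stats[key]}")
    return "\n".join(lines) + "\n"


# Usage (performs a network request, so it is left commented out):
# print(to_prometheus_text(fetch_stats("https://peertube2.cpy.re")), end="")
```

The output could then be served over HTTP or written to a file for a textfile-style collector; which of those fits depends on the monitoring setup.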

Thanks :)

faust64 commented 3 years ago

I was looking for a Prometheus exporter as well.

Based on previous replies, I tried the Prometheus (community) JSON Exporter -- quay.io/prometheuscommunity/json-exporter

Here's a sample configuration, gathering metrics from PeerTube:

metrics:
- name: peertube_local_videos_count
  path: "{ .totalLocalVideos }"
  help: Peertube Local Videos Count
- name: peertube_local_video_views_count
  path: "{ .totalLocalVideoViews }"
  help: Peertube Local Video Views
- name: peertube_local_video_size_bytes
  path: "{ .totalLocalVideoFilesSize }"
  help: Peertube Local Video Files Size
- name: peertube_local_video_comments_count
  path: "{ .totalLocalVideoComments }"
  help: Peertube Local Comments Count
- name: peertube_all_videos_count
  path: "{ .totalVideos }"
  help: Peertube Videos Count
- name: peertube_all_video_comments_count
  path: "{ .totalVideoComments }"
  help: Peertube Comments Count
- name: peertube_users_count
  path: "{ .totalUsers }"
  help: Peertube Users Count
- name: peertube_daily_active_users_count
  path: "{ .totalDailyActiveUsers }"
  help: Peertube Active Users Count - last 24h
- name: peertube_weekly_active_users_count
  path: "{ .totalWeeklyActiveUsers }"
  help: Peertube Active Users Count - last 7d
- name: peertube_monthly_active_users_count
  path: "{ .totalMonthlyActiveUsers }"
  help: Peertube Active Users Count - last 30d
- name: peertube_followers_count
  path: "{ .totalInstanceFollowers }"
  help: Peertube Instances Followers Count
- name: peertube_following_count
  path: "{ .totalInstanceFollowing }"
  help: Peertube Instances Following Count
- name: peertube_pub_processed_count
  path: "{ .totalActivityPubMessagesProcessed }"
  help: Peertube Activity Pub Messages Processed Count
- name: peertube_pub_processed_per_sec_avg
  path: "{ .activityPubMessagesProcessedPerSecond }"
  help: Peertube Activity Pub Messages Processed Per Second
- name: peertube_pub_waiting_count
  path: "{ .totalActivityPubMessagesWaiting }"
  help: Peertube Activity Pub Messages Waiting Count

Now, the JSON exporter needs a target passed as a GET parameter when Prometheus queries it to scrape metrics. Working with Kubernetes, I'd rather avoid any probe with parameters, as it requires a corresponding configuration in your Prometheus server... Writing a PeerTube exporter from scratch was easier:

https://gitlab.com/synacksynack/opsperator/docker-peertubeexporter/-/blob/master/config/peertube_exporter.py

#!/usr/bin/env python

import os
import time

import requests
from prometheus_client import start_http_server
from prometheus_client.core import GaugeMetricFamily, REGISTRY

# Port the exporter listens on, and base URL of the PeerTube instance.
# Both can be overridden through environment variables.
bind_port = os.environ.get('EXPORTER_PORT', 9113)
peertube_root = os.environ.get('PEERTUBE_URL', 'http://localhost:9000')
peertube_url = peertube_root + '/api/v1/server/stats'

class JSONCollector(object):
    def collect(self):
        # Stats are fetched on every scrape; the timeout keeps a stalled
        # PeerTube instance from blocking the scrape indefinitely.
        response = requests.get(peertube_url, timeout=10).json()

        yield GaugeMetricFamily('peertube_local_videos_count',
                                'Peertube Local Videos Count',
                                int(response['totalLocalVideos']))
        yield GaugeMetricFamily('peertube_local_video_views_count',
                                'Peertube Local Videos Views',
                                int(response['totalLocalVideoViews']))
        yield GaugeMetricFamily('peertube_local_video_size_bytes',
                                'Peertube Local Videos Space Used in Bytes',
                                int(response['totalLocalVideoFilesSize']))
        yield GaugeMetricFamily('peertube_local_video_comments_count',
                                'Peertube Local Videos Comments Count',
                                int(response['totalLocalVideoComments']))
        yield GaugeMetricFamily('peertube_all_videos_count',
                                'Peertube Total Videos Count',
                                int(response['totalVideos']))
        yield GaugeMetricFamily('peertube_all_video_comments_count',
                                'Peertube Total Videos Comments Count',
                                int(response['totalVideoComments']))
        yield GaugeMetricFamily('peertube_users_count',
                                'Peertube Total Users Count',
                                int(response['totalUsers']))
        yield GaugeMetricFamily('peertube_active_users_daily_count',
                                'Peertube Total Users Active in the last day',
                                int(response['totalDailyActiveUsers']))
        yield GaugeMetricFamily('peertube_active_users_weekly_count',
                                'Peertube Total Users Active in the last week',
                                int(response['totalWeeklyActiveUsers']))
        yield GaugeMetricFamily('peertube_active_users_monthly_count',
                                'Peertube Total Users Active in the last month',
                                int(response['totalMonthlyActiveUsers']))
        yield GaugeMetricFamily('peertube_followers_count',
                                'Peertube Followers Instances Count',
                                int(response['totalInstanceFollowers']))
        yield GaugeMetricFamily('peertube_following_count',
                                'Peertube Following Instances Count',
                                int(response['totalInstanceFollowing']))
        yield GaugeMetricFamily('peertube_pub_processed_count',
                                'Peertube Processed Publications Count',
                                int(response['totalActivityPubMessagesProcessed']))
        yield GaugeMetricFamily('peertube_pub_processed_per_sec_avg',
                                'Peertube Average Publications Processed per Second',
                                float(response['activityPubMessagesProcessedPerSecond']))
        yield GaugeMetricFamily('peertube_pub_waiting_count',
                                'Peertube Waiting Publications Count',
                                int(response['totalActivityPubMessagesWaiting']))

if __name__ == "__main__":
    REGISTRY.register(JSONCollector())
    start_http_server(int(bind_port))
    print("Exporter started - listening on :%s" % bind_port)
    while True:
        time.sleep(1)

Just make sure to pip install requests and prometheus_client.
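For comparison, here is a hedged sketch of the probe-style request that the JSON exporter approach described above relies on: the PeerTube stats URL travels as a `target` query parameter on the exporter's `/probe` path. The exporter address (`localhost:7979`) and the instance hostname are assumptions for illustration, and `probe_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def probe_url(exporter_base, target):
    """Build the URL Prometheus would have to scrape on the JSON exporter."""
    # The target must be URL-encoded because it is itself a full URL.
    return f"{exporter_base}/probe?{urlencode({'target': target})}"


# Hypothetical addresses, for illustration only.
url = probe_url(
    "http://localhost:7979",
    "https://peertube.example.org/api/v1/server/stats",
)
print(url)
```

This extra parameter is what has to be mirrored in the Prometheus scrape configuration, whereas the dedicated exporter above is scraped on a plain /metrics path.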

Now, considering there are tons of libraries exporting metrics in Node.js, it is regrettable to stick with that JSON object. Prometheus exporter libraries tend to expose more than just your application metrics; see prometheus-api-metrics and express-prometheus-middleware or, at a lower level, prom-client. It would be nice to have data such as the Node.js event loop, an Express.js requests breakdown, ... as well.

drym3r commented 2 years ago

@faust64 FYI, if you use the json-exporter Helm chart, that's not really necessary, since the chart can create the ServiceMonitor and it doesn't have any special needs:

...
serviceMonitor:
  enabled: true
  defaults:
    labels:
      # The label your Prometheus instance uses to select ServiceMonitors
      release: prometheus
  targets:
    - name: whatever
      url: https://whatever.com/api/v1/server/stats
      interval: 60s
      scrapeTimeout: 60s
      additionalMetricsRelabels: {}
...

faust64 commented 2 years ago

I did mention the json-exporter, and why I find it impractical.

Regarding ServiceMonitors: this CRD is not standard. To use a ServiceMonitor, having Prometheus is not enough: you need to have the Prometheus Operator deployed. Sadly, I see this confusion very often, on other projects I follow as well.

For the record:

Prometheus operator != Prometheus
Prometheus operator != stable/production-ready/...
Prometheus operator is not maintained by, nor a part of, the Prometheus project.

Prometheus can work outside of Kubernetes. Prometheus operator can't.

If you're familiar with OpenShift, or with Red Hat's other open-source contributions, you would think twice about setting one up.

The Prometheus Operator project has been in beta for years. The Prometheus configuration it generates is highly debatable, and you can't "fix" the mistakes it makes unless you get your contribution merged into the operator codebase. I would argue that once you know how to configure Prometheus, that operator is something you won't ever install again.

Prometheus has built-in Kubernetes service discovery. ServiceMonitors are useless: they over-complicate what can be done natively, while introducing security concerns. ServiceMonitors are also not portable to anyone who is not using the Prometheus operator (e.g. setting up Prometheus outside of Kubernetes, where making such a mistake just isn't an option).

... but that's another debate. It could work, granted you don't care much about learning Prometheus, and are using Kubernetes. That is stretching the initial question, though: there's no mention of Kubernetes. And it makes sense: deploying PeerTube in Kubernetes is fun for tests, debugging, ... Going to prod, it's unlikely you can sell your customers on deploying a Kubernetes cluster alongside PeerTube.

drym3r commented 2 years ago

I see your point, but CRDs are by definition not standard, so I do not agree. I've been using the operator for about 3 years and I love it. I actually do not understand having k8s and Prometheus and not using it, which is why I assumed you were using it x)

Anyway, that's beside the point, as you said. As you can see in the issues linked above, I'd like to write some documentation about how to use PeerTube metrics, so if you have updated information about how to use your exporter, it would be cool to add it. I can see that it doesn't have all the metrics; there are a bunch more ActivityPub metrics now, if you want to update it.
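One possible way to keep an exporter from lagging behind new stats fields is to derive metric names from whatever numeric keys the endpoint returns, instead of listing them by hand. The sketch below is an assumption-laden illustration: `snake_case` and `stats_to_metrics` are hypothetical helpers, and the generated names (for example `peertube_total_users`) differ from the hand-picked ones in the exporter above.

```python
import re


def snake_case(key):
    """Convert a camelCase stats key to snake_case."""
    # Insert "_" before every capital letter except a leading one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()


def stats_to_metrics(stats, prefix="peertube_"):
    """Map every numeric field of the stats document to a metric value."""
    metrics = {}
    for key, value in stats.items():
        # bool is a subclass of int in Python, so exclude it explicitly;
        # strings, lists and nested objects are skipped as well.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        metrics[prefix + snake_case(key)] = float(value)
    return metrics
```

Each entry of the returned dictionary could then be yielded as a gauge from a collector. The trade-off is that metric names follow the API's field names, so a renamed field upstream silently renames the metric.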