metno / discovery-metadata-catalog-ingestor

Apache License 2.0
1 stars 1 forks source link

back to GunicornPrometheusMetrics, CollectorRegistry, port 9200 #244

Closed charlienegri closed 1 month ago

charlienegri commented 1 month ago

Summary: this again for me exposes all the metrics at port 9200 when I run the container with docker....

Related issue: #220

Suggested reviewer(s):

Reviewer checklist:

The checklist is based on the S-ENDA conventions and definition of done (see General Conventions). The above points are not necessarily relevant to all contributions. In that case, please add a short explanation to help the reviewer.

charlienegri commented 1 month ago

Screenshot from 2024-10-08 11-55-33 @magnarem if this does not work I am out of ideas, this works for me locally, again

charlienegri commented 1 month ago

will need this too https://gitlab.met.no/tjenester/s-enda/-/merge_requests/2930

charlienegri commented 1 month ago

Approved. And merged. Le'ts hope this works now.

If not, I bet the issue is related to the wsgi.py file. Since the gunicorn process loads wsgi:appthen in the wsgi.py file, I am not sure if the lines after app = App() is then loaded by gunicorn.

Also in the statefulset.yml for dmci in the tjenester-repo. The gunicorn command from the Dockerfile is overrided, so you will probably need to add the -c /src/gunicorn_prometheus_config.py to the command: for the dmci container in the statefulset.yml too. And make sure the 9200 is exposded and scraped

we got the flask metrics exposed correctly at the 9200 port tho initially, before I implemente the custom ones, so I think that the GunicornPrometheusMetrics() or variants of it in wsgi.py works, but the problem is having both those default ones and the custom one exposed at the same port and by the same collector

charlienegri commented 1 month ago

mmh /src/gunicorn_prometheus_config.py' doesn't exist
I must have merged too fast on the other side, restarted the pod now

magnarem commented 1 month ago

yeah. you probably not need this copy statement. Since the gunicorn_prometheus_config.py resides in container/ folder together with the wsgi.py both files are copied to root, so you will not need this line COPY container/gunicorn_prometheus_config.py /src

and then the gunicorn command should work with:

CMD gunicorn -c gunicorn_prometheus_config.py --worker-class sync --workers 5 --bind 0.0.0.0:8000 wsgi:app --keep-alive 5 --log-level info

Also I see no problem with our custom metrics (the failed dist) you implemented is exposed on different port than the other prometheus flask metrics. You can have multiple annotations for scraping in kubernetes I imagine.

magnarem commented 1 month ago

Also for the 9200 Port, you will probably have to add a special ingress in the ingress.yml in the dmci base folder.

 rules:
    - host: "dmci.s-enda.k8s.met.no"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dmci
                port:
                  name: http
        - path:/flask-metrics
         pathType: Prefix?
         backend:
         service:
           name: dmci
           port:
             name: http-flask-metrics
         and then in statefulset

         ```
           ports:
        - name: http
          containerPort: 8000
          protocol: TCP
     - name: http-flask-metrics
       port: 9200
       protocol: TCP
       ``
charlienegri commented 1 month ago

the exposure of 9200 in the statefulset is already there, I had added it, removed it and added it back after last update
the endpoint seems also to have everything now, I am happy with this finally --> https://dmci.s-enda-dev.k8s.met.no/metrics

charlienegri commented 1 month ago

yeah. you probably not need this copy statement. Since the gunicorn_prometheus_config.py resides in container/ folder together with the wsgi.py both files are copied to root, so you will not need this line COPY container/gunicorn_prometheus_config.py /src

and then the gunicorn command should work with:

CMD gunicorn -c gunicorn_prometheus_config.py --worker-class sync --workers 5 --bind 0.0.0.0:8000 wsgi:app --keep-alive 5 --log-level info

Also I see no problem with our custom metrics (the failed dist) you implemented is exposed on different port than the other prometheus flask metrics. You can have multiple annotations for scraping in kubernetes I imagine.

gunicorn_prometheus_config.py is indeed in the root folder of the container so as you say we can skip the copy to src step

charlienegri commented 1 month ago

w̶h̶i̶l̶e̶ ̶l̶o̶o̶k̶i̶n̶g̶ ̶a̶t̶ ̶t̶h̶e̶ ̶f̶e̶d̶e̶r̶a̶t̶e̶d̶ ̶m̶e̶t̶r̶i̶c̶s̶ ̶I̶ ̶s̶e̶e̶ ̶s̶o̶m̶e̶ ̶i̶n̶c̶o̶n̶s̶i̶s̶t̶e̶n̶c̶i̶e̶s̶.̶.̶ ̶I̶ ̶f̶e̶a̶r̶ ̶t̶h̶e̶ ̶e̶x̶p̶o̶s̶u̶r̶e̶ ̶m̶i̶g̶h̶t̶ ̶h̶a̶p̶p̶e̶n̶ ̶o̶n̶ ̶o̶n̶e̶ ̶i̶n̶s̶t̶a̶n̶c̶e̶ ̶o̶n̶l̶y̶ ̶:̶/̶ ̶.̶.̶ ̶t̶h̶i̶s̶ ̶m̶i̶g̶h̶t̶ ̶b̶e̶ ̶d̶u̶e̶ ̶t̶o̶ ̶t̶h̶e̶ ̶u̶s̶e̶ ̶o̶f̶ ̶G̶u̶n̶i̶c̶o̶r̶n̶P̶r̶o̶m̶e̶t̶h̶e̶u̶s̶M̶e̶t̶r̶i̶c̶s̶ ̶i̶n̶s̶t̶e̶a̶d̶ ̶o̶f̶ ̶G̶u̶n̶i̶c̶o̶r̶n̶I̶n̶t̶e̶r̶n̶a̶l̶P̶r̶o̶m̶e̶t̶h̶e̶u̶s̶M̶e̶t̶r̶i̶c̶s̶?̶ ̶s̶e̶e̶ ̶h̶t̶t̶p̶s̶:̶/̶/̶g̶i̶t̶h̶u̶b̶.̶c̶o̶m̶/̶r̶y̶c̶u̶s̶8̶6̶/̶p̶r̶o̶m̶e̶t̶h̶e̶u̶s̶̶f̶l̶a̶s̶k̶̶e̶x̶p̶o̶r̶t̶e̶r̶/̶b̶l̶o̶b̶/̶m̶a̶s̶t̶e̶r̶/̶e̶x̶a̶m̶p̶l̶e̶s̶/̶g̶u̶n̶i̶c̶o̶r̶n̶-̶m̶u̶l̶t̶i̶p̶r̶o̶c̶e̶s̶s̶-̶1̶0̶9̶/̶s̶e̶r̶v̶e̶r̶.̶p̶y̶ ̶b̶u̶t̶ ̶i̶n̶ ̶m̶y̶ ̶p̶r̶e̶v̶i̶o̶u̶s̶ ̶a̶t̶t̶e̶m̶p̶t̶ ̶t̶o̶ ̶u̶s̶e̶ ̶t̶h̶a̶t̶ ̶t̶h̶a̶n̶ ̶w̶e̶ ̶w̶e̶r̶e̶ ̶m̶i̶s̶s̶i̶n̶g̶ ̶t̶h̶e̶ ̶f̶l̶a̶s̶k̶ ̶s̶t̶a̶n̶d̶a̶r̶d̶ ̶m̶e̶t̶r̶i̶c̶s̶ this might be correct after all, one instance only is consistent with the catalog-rebuilder-flask