Closed: Cesuuur closed this issue 2 years ago.
In case you ever come back here... I have found an error that has to do with labels. In app.py:
# A gauge set for the predicted values
GAUGE_DICT = dict()
for predictor in PREDICTOR_MODEL_LIST:
    unique_metric = predictor.metric
    label_list = list(unique_metric.label_config.keys())
    label_list.append("value_type")
    if unique_metric.metric_name not in GAUGE_DICT:
        GAUGE_DICT[unique_metric.metric_name] = Gauge(
            unique_metric.metric_name + "_" + predictor.model_name,
            predictor.model_description,
            label_list,
        )
Here you initialize the Gauge and pass it a label list, but there is one error: you create a Gauge per metric, yet you only store the label set of one of that metric's series. This ends in the 500: Internal Server Error that I mentioned above when Prometheus does its HTTP GET.
# Check for all the columns available in the prediction
# and publish the values for each of them
for column_name in list(prediction.columns):
    GAUGE_DICT[metric_name].labels(
        **predictor_model.metric.label_config, value_type=column_name
    ).set(prediction[column_name][0])
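For context, here is a minimal sketch of the prometheus_client behaviour behind that 500 (metric and label names below are made up): .labels() must be called with exactly the label names the Gauge was declared with, so a series carrying labels that were not stored at Gauge-creation time raises a ValueError, which then surfaces as the 500 when the scrape is handled.

from prometheus_client import Gauge

g = Gauge("demo_metric_fourier", "demo gauge", ["instance", "value_type"])

# OK: exactly the declared label names
g.labels(instance="host:9100", value_type="yhat").set(1.0)

# Raises a ValueError because 'job' was never declared, which is what
# happens when a second series of the same metric carries labels that
# were missing when the Gauge was first created.
g.labels(instance="host:9100", job="node", value_type="yhat").set(1.0)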
It's easy to solve; I wrote a function that gathers the labels of all the series belonging to one metric.
def all_labels(unique_metric, label_list):
    # global GAUGE_DICT
    # If it is a new metric, initialize the list
    if unique_metric.metric_name not in GAUGE_DICT:
        label_list = list(unique_metric.label_config.keys())
        label_list.append("value_type")
    # Otherwise, walk the whole label set (new series, but same metric)
    # and add the labels we do not have stored yet
    else:
        for label in list(unique_metric.label_config.keys()):
            if label not in label_list:
                label_list.append(label)
    return label_list
....
....
....
# A gauge set for the predicted values
GAUGE_DICT = dict()
label_list = list()
for predictor in PREDICTOR_MODEL_LIST:
    unique_metric = predictor.metric
    label_list = all_labels(unique_metric, label_list)
    if unique_metric.metric_name not in GAUGE_DICT:
        GAUGE_DICT[unique_metric.metric_name] = Gauge(
            unique_metric.metric_name + "_" + predictor.model_name,
            predictor.model_description,
            label_list,
        )
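Another way to make sure each Gauge ends up declared with the full label set of its metric (just a sketch, not the project's code) is to collect the labels in a first pass and only create the Gauges afterwards, so no Gauge is ever built from a partial list:

# Sketch: two-pass variant of the same idea. First gather every label seen
# across all series of a metric, then create one Gauge per metric name with
# the complete label list.
labels_per_metric = dict()
for predictor in PREDICTOR_MODEL_LIST:
    unique_metric = predictor.metric
    labels = labels_per_metric.setdefault(unique_metric.metric_name, {"value_type"})
    labels.update(unique_metric.label_config.keys())

GAUGE_DICT = dict()
for predictor in PREDICTOR_MODEL_LIST:
    unique_metric = predictor.metric
    if unique_metric.metric_name not in GAUGE_DICT:
        GAUGE_DICT[unique_metric.metric_name] = Gauge(
            unique_metric.metric_name + "_" + predictor.model_name,
            predictor.model_description,
            sorted(labels_per_metric[unique_metric.metric_name]),
        )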
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
/close
@sesheta: Closing this issue.
Hello, I'm having trouble integrating the image into my platform (Kubernetes). I have run a lot of tests but have not drawn any conclusions. I'm fairly sure the error has to do with the labeling of my metrics.
At first I had the same error that appears in this issue, 500: Internal Server Error, and that is due to this check in prometheus_client:
At initialization it takes a set of labels, and on each new request it checks that they match. So you need to pass all of the metric's labels to it even if you don't want to specify them. That is what I have done. I tested with a couple of metrics, and each one gave me a different result. (I wanted to run app.py locally, but there was no way; I tried several virtual environments and Python 3.8, but I hit this issue: Getting AttributeError: Can't pickle local object 'BaseAsyncIOLoop.initialize..assign_thread_identity' error.)
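To illustrate that point (the names below are made up, not from my setup): every label the Gauge was declared with has to appear in the .labels() call, even the ones a given series does not define, for example by defaulting them to an empty string:

from prometheus_client import Gauge

declared_labels = ["instance", "job", "value_type"]   # labels the Gauge was created with
gauge = Gauge("demo_metric_prophet", "demo gauge", declared_labels)

series_labels = {"instance": "host:9100"}             # labels this particular series defines

# Fill in every declared label, defaulting the missing ones to an empty
# string, so that the label check in prometheus_client passes.
values = {name: series_labels.get(name, "") for name in declared_labels}
values["value_type"] = "yhat"
gauge.labels(**values).set(42.0)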
Here is the configuration:
And here is the result I get:
(These logs are from testing with a 15-day rolling_window and a 15m retrain.) I have tried to manually do the HTTP GET .../metrics that Prometheus would do, and I can verify that it does not send the forecasts, only metadata like:
It seems that the model is not trained. With the other metric the opposite happens: the model is trained, but it seems that the Tornado server never gets to initialize.
The up metric is a dumb metric, because it is just 1 or 0, but I was only trying to see what was wrong.
As I said before, here it trains the model all the time, but it never starts the Tornado server (these logs are with the configuration given above).
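For what it's worth, the manual GET against /metrics mentioned above can be reproduced with something like this (a sketch; localhost:8080 is an assumption, use whatever host/port the service is actually exposed on):

import requests

# Fetch the exporter's /metrics endpoint the same way Prometheus would.
resp = requests.get("http://localhost:8080/metrics", timeout=10)
print(resp.status_code)
print(resp.text)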
Thank you very much in advance. And by the way, quite a good project :)