Digital-Defiance / nlp-metaformer

An ablation study on the transformer network for Natural Language Processing
3 stars 0 forks source link

mlflow stopped responding #27

Closed RuiFilipeCampos closed 8 months ago

RuiFilipeCampos commented 8 months ago

I'm not entirely sure what happened

gonna deploy a more robust solution:

on the side of the the gpu machine, I might code a worker to offload the metrics logging to a separate process, but hopefully having them in the same subnet will decrease latency to acceptable levels

RuiFilipeCampos commented 8 months ago

https://www.restack.io/docs/mlflow-knowledge-mlflow-dataset-management