AQ-AI / openaq-engine

http://www.aqai.xyz
BSD 3-Clause "New" or "Revised" License

MLflow running on http://3.82.115.10:5000/ but experiments cannot be logged #61

Closed ChristinaLast closed 4 months ago

ChristinaLast commented 11 months ago

So there is an MLflow instance running on http://3.82.115.10:5000/

Screenshot 2023-09-19 at 15 33 37

However, nothing is being logged to the online MLflow server. Ideally we have:

  1. Airflow schedules jobs and workflows
  2. Workflows call functions that collect/store/extract data
  3. Once that data is in tabular format (e.g. in the machine learning pipeline realm), MLflow starts logging parameters/artifacts
  4. Training flows log datasets/models/parameters/metrics...
  5. Airflow takes over to orchestrate the model serving pipeline, serving predictions to an endpoint.
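
Steps 3 and 4 above could look roughly like the following. This is a minimal sketch, not the code in openaq-engine: the function name `log_training_run` and its arguments are hypothetical, and only standard `mlflow` tracking calls are used.

```python
from typing import Dict, Optional


def log_training_run(
    tracking_uri: str,
    experiment_name: str,
    params: Dict[str, str],
    metrics: Dict[str, float],
    artifact_path: Optional[str] = None,
) -> str:
    """Log one training run to an MLflow tracking server and return its run ID."""
    # Imported lazily so this module can be loaded without mlflow installed.
    import mlflow

    # Point the client at the server explicitly, including the scheme (http://).
    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment(experiment_name)

    with mlflow.start_run() as run:
        mlflow.log_params(params)      # e.g. pollutant, source, country
        mlflow.log_metrics(metrics)    # e.g. evaluation scores
        if artifact_path is not None:
            mlflow.log_artifact(artifact_path)  # e.g. a saved model file
        return run.info.run_id
```

A call such as `log_training_run("http://3.82.115.10:5000", "openaq-pm25", {"pollutant": "pm25", "country": "US"}, {"rmse": rmse})` (experiment name and metric chosen only for illustration) should then appear in the server UI, provided the server is reachable from where the pipeline runs.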

However, there is evidence that some of it is working. For example, /home/ec2-user/openaq-engine/3.82.115.10:5000/201015484053809956/139d9896058446dfb8213e5aa7b44915/meta.yaml is a path of metrics from model runs that are being logged in the correct directory (it is just that the server isn't "picking up on that").
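
One possible explanation, which is an assumption and not confirmed anywhere in this thread: the directory named `3.82.115.10:5000` in that path suggests the tracking URI may have been set without the `http://` scheme, in which case the client can treat it as a local folder and write run files there instead of sending them to the server. A minimal sketch of a check for this (the helper `tracking_uri_kind` is hypothetical and not part of MLflow):

```python
def tracking_uri_kind(uri: str) -> str:
    """Roughly classify a tracking URI by its scheme."""
    scheme, separator, _rest = uri.partition("://")
    if not separator:
        # No "scheme://" prefix at all, e.g. "3.82.115.10:5000" or "./mlruns".
        return "local-path"
    scheme = scheme.lower()
    if scheme in ("http", "https"):
        return "remote-server"
    if scheme == "file":
        return "local-path"
    # Database-backed stores and other schemes, e.g. "sqlite:///mlflow.db".
    return "other"


if __name__ == "__main__":
    for candidate in ("3.82.115.10:5000", "http://3.82.115.10:5000"):
        print(candidate, "->", tracking_uri_kind(candidate))
```

If the value of `MLFLOW_TRACKING_URI` used on the EC2 instance comes back as `local-path`, that would be consistent with runs landing on disk while the server UI stays empty.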

This is the place where we want MLflow to complete (i.e. once we are in the more "classical" MLflow pipeline area):

Screenshot 2023-09-19 at 15 42 28
ChristinaLast commented 11 months ago

Also, it says that it is logging to S3, but the bucket hasn't been updated since January 2023: https://s3.console.aws.amazon.com/s3/buckets/openaq-mlflow-bucket?region=us-east-1&tab=objects

ChristinaLast commented 11 months ago
openaq-engine run-pipeline local_data models --pollutant pm25 --source openaq-api --country US 

Use this command to test.

Also, for local environment variables, set MLFLOW_TRACKING_URI=https://localhost:5000

ChristinaLast commented 4 months ago

Closed in #53.