mlops-for-all / mlops-for-all.github.io

[Chapter 2] setup mlflow #28

Closed anencore94 closed 2 years ago

anencore94 commented 2 years ago

resolves #15 related #19

anencore94 commented 2 years ago

After using kubectl port-forward to expose mlflow on local port 5000 and minio on local port 9000, I ran the following source code, which uses the mlflow SDK, and confirmed that it works correctly.

from pprint import pprint

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

import mlflow
# utils is a local helper module, presumably utils.py from MLflow's sklearn_autolog example
from utils import fetch_logged_data

def main():
    mlflow.set_tracking_uri("http://localhost:5000")
    # print(mlflow.get_artifact_uri())

    # enable autologging
    mlflow.sklearn.autolog()

    # prepare training data
    X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    y = np.dot(X, np.array([1, 2])) + 3

    # train a model
    pipe = Pipeline([("scaler", StandardScaler()), ("lr", LinearRegression())])
    with mlflow.start_run() as run:
        pipe.fit(X, y)
        print("Logged data and model in run: {}".format(run.info.run_id))

    # show logged data
    for key, data in fetch_logged_data(run.info.run_id).items():
        print("\n---------- logged {} ----------".format(key))
        pprint(data)

if __name__ == "__main__":
    import os

    os.environ["AWS_ACCESS_KEY_ID"] = "minio"
    os.environ["AWS_SECRET_ACCESS_KEY"] = "minio123"
    os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://localhost:9000"
    main()

Note that env vars such as AWS_ACCESS_KEY_ID must be set in the local environment.
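
Since a missing env var or a dropped port-forward tends to surface only as an artifact upload error, a small preflight check can help. The following is a minimal sketch, not part of this PR: the helper names (`missing_env_vars`, `port_is_open`, `preflight`) are hypothetical, and only the env var names and the ports 5000 and 9000 come from the script above.

```python
import os
import socket

# Env var names taken from the script above. The helper names in this
# sketch are hypothetical and are not part of mlflow or of this PR.
REQUIRED_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "MLFLOW_S3_ENDPOINT_URL",
)

# Local ports assumed to be forwarded with kubectl port-forward.
DEFAULT_PORTS = (("mlflow", 5000), ("minio", 9000))


def missing_env_vars(env=None, required=REQUIRED_ENV_VARS):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(env=None, host="localhost", ports=DEFAULT_PORTS):
    """Collect human-readable problems; an empty list means ready to run."""
    problems = ["env var not set: " + name for name in missing_env_vars(env)]
    for label, port in ports:
        if not port_is_open(host, port):
            problems.append(f"{label} not reachable on {host}:{port}")
    return problems
```

Calling `preflight()` before `main()` and printing any returned problems would catch an unset variable or a closed port before the run starts.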

Aiden-Jeon commented 2 years ago

So this is where env hell begins..