armgilles / vcub_keeper

Analysis of Vcub station activity in the Bordeaux metropolitan area, to detect out-of-service stations early
https://vcubwatcher.herokuapp.com/
MIT License

Unsupervised clustering of station activity #23

Closed armgilles closed 3 years ago

armgilles commented 4 years ago

Ideas:

ML

Feature reduction:

Process:

Isolation Forest:

armgilles commented 3 years ago

Model training must not take the station into account while it is out of service (status = 0).
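
A minimal sketch of that filtering step with pandas. Only the `status` column and its `0` value come from the comment above; the other column names and the toy values are assumptions made for illustration.

```python
import pandas as pd

# Toy activity data; every column except `status` is a hypothetical name.
ts_activity = pd.DataFrame(
    {
        "station_id": [109, 109, 109, 109],
        "status": [1, 0, 1, 1],  # 0 = station out of service
        "transactions_all": [4, 0, 2, 7],
    }
)

# Keep only the rows where the station was in service before training.
train_data = ts_activity[ts_activity["status"] != 0].reset_index(drop=True)
print(train_data)
```

Dropping these rows before fitting keeps the model from learning "no transactions" as normal behaviour during an outage.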

armgilles commented 3 years ago

Some stations have very little activity (few bikes taken out):

image
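
One possible way to list such stations, sketched with pandas. The column names `station_id` and `transactions_out`, the toy values and the threshold are all assumptions, not taken from the project.

```python
import pandas as pd

# Toy data: station 1 is busy, station 2 is almost never used.
ts_activity = pd.DataFrame(
    {
        "station_id": [1, 1, 1, 2, 2, 2],
        "transactions_out": [5, 8, 6, 0, 1, 0],  # hypothetical column name
    }
)

# Total number of bikes taken per station, least active first.
total_out = (
    ts_activity.groupby("station_id")["transactions_out"].sum().sort_values()
)

# Arbitrary cut-off, only for illustration.
threshold = 5
low_activity_stations = total_out[total_out < threshold].index.tolist()
print(low_activity_stations)
```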

armgilles commented 3 years ago

How to:

from vcub_keeper.config import *
from vcub_keeper.reader.reader import *
from vcub_keeper.reader.reader_utils import filter_periode
from vcub_keeper.visualisation import *
from vcub_keeper.transform.features_factory import *
from vcub_keeper.ml.cluster import train_cluster_station, predict_anomalies_station
from vcub_keeper.ml.cluster_utils import load_model, export_model

# Read the activity file
ts_activity = read_time_serie_activity()

# Some features
ts_activity = get_transactions_in(ts_activity)
ts_activity = get_transactions_out(ts_activity)
ts_activity = get_transactions_all(ts_activity)
ts_activity = get_consecutive_no_transactions_out(ts_activity)

# Set an ID station
station_id = 109

To train the cluster model for one station:

clf = train_cluster_station(ts_activity, station_id=station_id)

# Export model
export_model(clf, station_id=station_id)

To predict anomalies:

clf = load_model(station_id=station_id)
station_pred = predict_anomalies_station(data=ts_activity, clf=clf, station_id=station_id)
# New column `anomaly`: 1 is OK, -1 is an anomaly

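
The 1 / -1 convention is the same one scikit-learn's `IsolationForest.predict` uses (1 for inliers, -1 for outliers). Assuming the station model follows it, here is a small standalone sketch on synthetic data; it does not use `vcub_keeper` and all values are made up.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic "normal" activity centred on 10, one feature per row.
X_train = rng.normal(loc=10.0, scale=1.0, size=(200, 1))

# One typical point and one extreme point to score.
X_new = np.array([[10.2], [100.0]])

clf = IsolationForest(n_estimators=100, random_state=0)
clf.fit(X_train)

# predict returns 1 for inliers and -1 for outliers.
pred = clf.predict(X_new)

# decision_function: the lower the score, the more abnormal the point.
scores = clf.decision_function(X_new)
print(pred, scores)
```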
armgilles commented 3 years ago

Use a meta-model on top of the anomaly detection algorithm (Isolation Forest): https://scikit-lego.netlify.app/meta.html#OutlierClassifier
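
A rough sketch of the idea behind such a meta-model, written with scikit-learn only. `OutlierAsClassifier` is a hypothetical class made up for illustration; it is not scikit-lego's `OutlierClassifier`, whose exact behaviour should be checked in the linked documentation.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


class OutlierAsClassifier:
    """Hypothetical wrapper exposing an outlier detector as a 0/1 classifier."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y=None):
        # The wrapped detector is unsupervised, so y is ignored.
        self.model.fit(X)
        self.classes_ = np.array([0, 1])
        return self

    def predict(self, X):
        # Detector convention: -1 = outlier, 1 = inlier.
        # Convention chosen here: 1 = outlier, 0 = inlier.
        return (self.model.predict(X) == -1).astype(int)


rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=1.0, size=(200, 1))

clf = OutlierAsClassifier(IsolationForest(random_state=0)).fit(X)
labels = clf.predict(np.array([[10.0], [100.0]]))
print(labels)
```

With 0/1 labels the detector can be scored with the usual classification metrics against known outages (for example `status = 0`).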