petrobras / 3W

Promotes development of ML algorithms for early detection and classification of undesirable events in offshore oil wells.
Apache License 2.0

Feat/adapt dev to 3w dataset 2.0 #126

Open castrokelly opened 2 weeks ago

This pull request adapts the dev.py sub-module to ensure full compatibility with the 3W Dataset 2.0. The main changes include updating the EventFolds class to correctly handle the new data loading process and removing the redundant extrai_arrays() function.

Changes made:

Example usage:

The following code snippet demonstrates how to use the updated Experiment class with the 3W Dataset 2.0:

import toolkit as tk

# Create an experiment for the "SPURIOUS_CLOSURE_OF_DHSV" event
experiment = tk.Experiment(event_name="SPURIOUS_CLOSURE_OF_DHSV")

# Generate the folds for the experiment
folds = experiment.folds()

# Access the training and test samples for each fold
for fold in folds:
    X_train, y_train = fold.extract_training_samples()
    X_test = fold.extract_test_samples()

    # ... your machine learning model training and evaluation code here ...
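
To make the placeholder in the loop above concrete, here is a minimal sketch of a per-fold training and prediction step. It assumes only the two fold methods shown in the snippet, extract_training_samples() and extract_test_samples(). FakeFold, MajorityClassifier, and run_folds are hypothetical names introduced for illustration; they are not part of the 3W Toolkit, and the toy data stands in for real dataset samples so the loop can run without the dataset installed.

```python
from collections import Counter


class FakeFold:
    """Hypothetical stand-in for a toolkit fold (not part of the 3W Toolkit)."""

    def __init__(self, X_train, y_train, X_test):
        self._X_train = X_train
        self._y_train = y_train
        self._X_test = X_test

    def extract_training_samples(self):
        # Mirrors the (X_train, y_train) pair used in the PR example.
        return self._X_train, self._y_train

    def extract_test_samples(self):
        # Mirrors the single X_test value used in the PR example.
        return self._X_test


class MajorityClassifier:
    """Trivial baseline: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def run_folds(folds):
    """Fit one model per fold and return one prediction list per fold."""
    predictions = []
    for fold in folds:
        X_train, y_train = fold.extract_training_samples()
        X_test = fold.extract_test_samples()
        model = MajorityClassifier().fit(X_train, y_train)
        predictions.append(model.predict(X_test))
    return predictions


# Toy data standing in for real 3W samples (assumption for illustration only).
folds = [
    FakeFold([[0.1], [0.2], [0.3]], [0, 0, 1], [[0.4], [0.5]]),
    FakeFold([[1.0], [1.1], [1.2]], [1, 1, 0], [[1.3]]),
]
all_predictions = run_folds(folds)
```

With the real toolkit, the list of FakeFold objects would presumably be replaced by the result of experiment.folds(), and MajorityClassifier by whichever model is under evaluation.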

Benefits:

By aligning the EventFolds class with the new data loading process and removing the redundant extrai_arrays() function, this contribution makes the 3W Toolkit simpler to use with the 3W Dataset 2.0, facilitating research and development of machine learning models for anomaly detection in oil wells.


By creating this pull request, I confirm that I have read and fully accept and agree with one of Petrobras' Contributor License Agreements (CLAs):

Our CLAs are based on the Apache Software Foundation's CLAs: