Enable collecting the T2 and SPE parameters and reusing them later for monitoring (quality control chart context)

This PR relates to issue #15. Problem statement: use the pca package as a monitoring method, in the form of a quality control chart.

A control chart is simply a univariate chart: the x-axis is the timestamp and the y-axis is the sensor reading. A limit is set so that any reading above it is considered an outlier. T2 and SPE are multivariate charts that can be used in a similar way. To enable this, the parameters learned from the training dataset must be stored and then applied to new incoming data.
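
The idea of learning a limit on training data and reusing it on new readings can be sketched with a plain univariate chart. This is only a minimal illustration using NumPy; the helper names `fit_limit` and `flag_outliers` and the synthetic readings are hypothetical and are not part of the pca package.

```python
import numpy as np

def fit_limit(train_values, n_std=3):
    """Learn control-chart parameters (mean, std, upper limit) from training data."""
    mu = float(np.mean(train_values))
    sigma = float(np.std(train_values, ddof=1))
    return {"mu": mu, "sigma": sigma, "limit": mu + n_std * sigma}

def flag_outliers(new_values, params):
    """Apply a previously learned limit to new readings without refitting."""
    return np.asarray(new_values) > params["limit"]

rng = np.random.default_rng(0)
train = rng.normal(loc=10.0, scale=1.0, size=1000)  # in-control history (synthetic)
params = fit_limit(train)

new = np.array([9.5, 10.2, 25.0])  # the last reading is far from the training range
flags = flag_outliers(new, params)
print(params["limit"], flags)
```

The stored `params` dictionary plays the role of the parameters that this PR returns from training and passes back in at inference time.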

Changes I have made:

- Allow hotellingsT2() and spe_dmodx() to use provided parameters, and to return those parameters when they are computed from scratch.
- Modify compute_outliers() and fit_transform() accordingly to fit the new format.
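
The "reuse the provided parameters, otherwise compute and return them" pattern described above could look roughly like the sketch below. It is a simplified Hotelling-style T2 on uncorrelated component scores, written with NumPy only. The function name `t2_scores` and the layout of `param` are assumptions made for illustration; they do not reproduce the actual signatures of the functions changed in this PR.

```python
import numpy as np

def t2_scores(PC, param=None):
    """Return (scores, param). Hypothetical helper, not the pca package API.

    param is a (mean, variance) pair with one value per component.
    When it is None the values are estimated from PC; otherwise the
    supplied values are reused unchanged.
    """
    PC = np.asarray(PC, dtype=float)
    if param is None:
        param = (PC.mean(axis=0), PC.var(axis=0, ddof=1))
    mean, var = param
    # Sum of squared, variance-scaled deviations over the components
    scores = np.sum((PC - mean) ** 2 / var, axis=1)
    return scores, param

rng = np.random.default_rng(1)
PC_train = rng.normal(size=(500, 3))
PC_new = rng.normal(size=(5, 3)) + np.array([0.0, 0.0, 8.0])  # shifted in one component

train_scores, param = t2_scores(PC_train)          # fit: parameters computed and returned
new_scores, same_param = t2_scores(PC_new, param)  # monitor: training parameters reused
```

Returning the parameters alongside the scores keeps a single code path for both training and monitoring, which is the behaviour the test script below relies on.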

Code to test out the new change:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from pca import pca

np.random.seed(42)

# Generate a synthetic dataset and split it into train and test sets
n_total, train_ratio = 10000, 0.8
n_features = 10
my_array = np.random.randint(low=1, high=10, size=(n_total, n_features))
features = [f'f{i}' for i in range(1, n_features + 1)]
X = pd.DataFrame(my_array, columns=features)
X_train = X.sample(frac=train_ratio)
X_test = X.drop(X_train.index)

# Training: fit the model and keep the learned outlier parameters (new in this PR)
model = pca(n_components=5, alpha=0.5, n_std=3, onehot=False, normalize=True, random_state=42)
results, param_dict = model.fit_transform(X=X_train[features], row_labels=None, col_labels=None, verbose=3)
T2_train = np.log(results['outliers']['y_score'])
T2_mu, T2_sigma = T2_train.agg(['mean', 'std'])
T2_limit = T2_mu + 3 * T2_sigma

# Inference: project new data and score it with the stored training parameters
PC_test = model.transform(X=X_test[features], row_labels=None, col_labels=None, verbose=3)
PC_test = np.array(PC_test)
scores, _ = model.compute_outliers(PC=PC_test, n_std=3, param_dict=param_dict, verbose=3)
T2_test = np.log(scores['y_score'])

# Plot: training scores (black), test scores (blue), mean (solid line), limit (dashed line)
n_train, n_test = T2_train.shape[0], T2_test.shape[0]
plt.figure(figsize=(14, 4))
plt.axhline(T2_mu, color='blue')
plt.axhline(T2_limit, color='red', linestyle='dashed')
plt.scatter(np.arange(n_train), T2_train, c='black', s=100, alpha=0.5)
plt.scatter(np.arange(n_train, n_train + n_test), T2_test, c='blue', s=100, alpha=0.5)
plt.show()

Enable collecting the T2 and SPE parameters and reusing them later for monitoring (quality control chart context) #16