Open Vignesh-Sampat opened 9 months ago
Hi @Vignesh-Sampat,
What do you mean exactly by data augmentation? Could you please give examples?
Data augmentation is a collection of methods used to expand a dataset by introducing modified versions of existing data or by generating synthetic data from existing samples. Applying augmentation on the fly within the dataloader, with a given probability (e.g., P=0.5), would be more advantageous than the traditional approach of modifying samples, saving them, and then adding them to the existing dataset.
Below is a list of example functions that can be used to reduce the risk of overfitting.
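To make the on-the-fly idea concrete, here is a minimal sketch of a dataset wrapper that augments a sample with probability `p` each time it is accessed. The class name `OnTheFlyAugmentedDataset`, its parameters, and the placeholder `negate` augmentation are all hypothetical and are not part of darts or any other library; the same pattern should fit inside the `__getitem__` of a PyTorch-style dataset.

```python
import random


class OnTheFlyAugmentedDataset:
    """Hypothetical wrapper: augments a stored series with probability p on access."""

    def __init__(self, series_list, augment_fns, p=0.5, seed=None):
        self.series_list = series_list    # original, unmodified samples
        self.augment_fns = augment_fns    # callables: series -> augmented series
        self.p = p                        # probability of augmenting a sample
        self.rng = random.Random(seed)    # private RNG for reproducibility

    def __len__(self):
        return len(self.series_list)

    def __getitem__(self, idx):
        series = self.series_list[idx]
        # With probability p, apply one randomly chosen augmentation;
        # otherwise return the stored sample unchanged.
        if self.augment_fns and self.rng.random() < self.p:
            augment = self.rng.choice(self.augment_fns)
            return augment(series)
        return series


def negate(series):
    # Placeholder augmentation for illustration only: flips the sign of every value.
    return [-x for x in series]


dataset = OnTheFlyAugmentedDataset([[1.0, 2.0, 3.0]], [negate], p=0.5, seed=0)
# Each access may or may not be augmented; the stored data is never modified.
samples = [dataset[0] for _ in range(4)]
```

Because the augmented copies are produced at access time, nothing extra is saved to disk and every epoch can see a different variant of the same sample.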
Gaussian noise/Jittering:
import numpy as np

def add_gaussian_noise(time_series, mean=0.0, stddev=1.0):
    """
    Adds Gaussian noise to a time series.
    Args:
        time_series (array-like): A time series to which noise is added.
        mean (float): The average value of the noise. Default is 0.0.
        stddev (float): Standard deviation of noise. Default is 1.0.
    Returns:
        noisy_series (np.array): Time series with added noise.
    """
    # Generate Gaussian noise with the same shape as the input
    noise = np.random.normal(mean, stddev, np.shape(time_series))
    # Add the noise to the original time series
    noisy_series = np.asarray(time_series) + noise
    return noisy_series

augmented_time_series_data = add_gaussian_noise(time_series_data, mean=0.0, stddev=0.05)
Random Scaling
def add_scaling(time_series, scale_factor):
    """
    Scales a time series by multiplying each element by scale_factor.
    :param time_series: numpy array, time series to be scaled
    :param scale_factor: the number by which all elements of the series will be multiplied
    :return: numpy array, scaled time series
    """
    scaled_time_series = time_series * scale_factor
    return scaled_time_series

scale_factor = 0.48
augmented_time_series_data = add_scaling(time_series_data, scale_factor)
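The scaling example uses a fixed factor (0.48), so every call produces the same output. One possible way to make the scaling actually random is to draw the factor from a distribution centred on 1 for each call. The sketch below assumes a normal distribution; the function name `random_scaling` and the `sigma` and `rng` parameters are hypothetical and not taken from the thread.

```python
import numpy as np


def random_scaling(time_series, sigma=0.1, rng=None):
    """Scales a series by a single factor drawn from N(1, sigma) (an assumption)."""
    # Use the caller's generator if given, otherwise a fresh unseeded one
    rng = np.random.default_rng() if rng is None else rng
    # One factor per call, so the shape of the series is preserved
    scale_factor = rng.normal(loc=1.0, scale=sigma)
    return np.asarray(time_series) * scale_factor


example = random_scaling(np.array([1.0, 2.0, 4.0]), sigma=0.1,
                         rng=np.random.default_rng(0))
```

Passing a seeded generator keeps experiments reproducible, while omitting `rng` gives a different factor on every call.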
Magnitude warping
import numpy as np
from scipy.interpolate import CubicSpline

def magnitude_warping(time_series, num_knots=4, warp_std_dev=0.2):
    """
    Applies magnitude warping to a time series using cubic splines.
    :param time_series: np.array, time series to distort
    :param num_knots: int, number of control points for splines
    :param warp_std_dev: float, standard deviation for distorting the values of control points
    :return: np.array, distorted time series
    """
    # Place evenly spaced knots along the series and draw a random multiplier (around 1) for each
    knot_positions = np.linspace(0, len(time_series) - 1, num=num_knots)
    knot_values = 1 + np.random.normal(0, warp_std_dev, num_knots)
    # Fit a cubic spline through the knots
    spline = CubicSpline(knot_positions, knot_values)
    # Generate the time indexes of the series
    time_indexes = np.arange(len(time_series))
    # Multiply each point by the smooth warping curve
    warped_time_series = time_series * spline(time_indexes)
    return warped_time_series

augmented_time_series_data = magnitude_warping(time_series_data, num_knots=5, warp_std_dev=0.45)
Time warping
import random
import numpy as np

def time_warping(time_series, num_operations=10, warp_factor=0.2):
    """
    Applies time warping to a time series.
    :param time_series: Time series, numpy array.
    :param num_operations: Number of insert/delete operations.
    :param warp_factor: Warp factor that determines the impact of operations.
    :return: Distorted time series (its length may differ from the input).
    """
    # Work on a float copy so inserted values are not truncated for integer input
    warped_series = np.array(time_series, dtype=float)
    for _ in range(num_operations):
        operation_type = random.choice(["insert", "delete"])
        index = random.randint(1, len(warped_series) - 2)
        if operation_type == "insert":
            # Insert a value by interpolating between two adjacent points
            insertion_value = (warped_series[index - 1] + warped_series[index]) * 0.5
            warp_amount = insertion_value * warp_factor * random.uniform(-1, 1)
            warped_series = np.insert(warped_series, index, insertion_value + warp_amount)
        elif operation_type == "delete":
            # Remove a random point
            warped_series = np.delete(warped_series, index)
        else:
            raise ValueError("Invalid operation type")
    return warped_series

augmented_time_series_data = time_warping(time_series_data, num_operations=20, warp_factor=0.25)
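Since the insert and delete operations in `time_warping` change the number of points, the output can end up longer or shorter than the input, which may be a problem when the model expects fixed-length windows. One possible remedy, sketched here as an assumption and not something proposed in the thread, is to resample the warped series back to the original length with linear interpolation. The helper name `resample_to_length` is hypothetical.

```python
import numpy as np


def resample_to_length(time_series, target_length):
    """Linearly interpolates a 1-D series onto target_length evenly spaced points."""
    series = np.asarray(time_series, dtype=float)
    # Map both the old and the new sample positions onto the interval [0, 1]
    old_positions = np.linspace(0.0, 1.0, num=len(series))
    new_positions = np.linspace(0.0, 1.0, num=target_length)
    return np.interp(new_positions, old_positions, series)


resampled = resample_to_length([0.0, 1.0, 2.0], 5)
```

It could be chained with the function above, for example `resample_to_length(time_warping(time_series_data), len(time_series_data))`.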
I am also looking for references/examples of how to add augmentations to training in darts. I was unable to find anything. @Vignesh-Sampat, were you able to solve this? Thanks
Is there any way to include data augmentation strategies in the darts TimeSeries object?