tilman151 / rul-datasets

A collection of datasets for RUL estimation as Lightning Data Modules.
https://krokotsch.eu/rul-datasets

Different data processing in C-MAPSS and N-C-MAPSS #66

Closed lurenyi233 closed 1 week ago

lurenyi233 commented 1 week ago

Dear author,

Thanks a lot for your great work! I noticed that this project applies the sliding time window approach to generate samples for the C-MAPSS dataset, while it uses a different approach for N-C-MAPSS (one sample per flight cycle).

However, the authors of N-C-MAPSS published another paper in which they also apply the sliding time window approach (TW=50) to N-C-MAPSS. This produces many more samples: Chao, M. A., Kulkarni, C., Goebel, K., & Fink, O. (2022). Fusing physics-based and deep learning models for prognostics. Reliability Engineering & System Safety, 217, 107961.
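For readers unfamiliar with the approach, here is a minimal sketch of what a sliding time window of TW=50 does to a single multivariate series. The function name `sliding_windows`, the stride of 1, the array shapes, and labeling each window with the RUL at its last time step are assumptions made for illustration; they are not taken from the cited paper or from rul-datasets.

```python
import numpy as np


def sliding_windows(series, rul, window_size=50, stride=1):
    """Cut a series of shape (num_steps, num_channels) into overlapping windows.

    Hypothetical helper for illustration only.
    """
    # sliding_window_view appends the window axis last:
    # (num_windows, num_channels, window_size)
    views = np.lib.stride_tricks.sliding_window_view(series, window_size, axis=0)
    # Move the window axis in front of the channels and apply the stride
    windows = views.transpose(0, 2, 1)[::stride]
    # Assumption: each window is labeled with the RUL of its last time step
    targets = rul[window_size - 1 :: stride]
    return windows, targets


# Synthetic stand-in data: 200 time steps, 3 sensor channels
series = np.arange(200 * 3, dtype=float).reshape(200, 3)
rul = np.arange(200)[::-1]

windows, targets = sliding_windows(series, rul)
print(windows.shape, targets.shape)  # (151, 50, 3) (151,)
```

A series of 200 steps yields 151 overlapping samples instead of one, which is why this approach produces so many more samples than one sample per flight cycle.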

Is it possible to create a unified way to process the data? I feel that it will help a lot in this field. Here is a project applying the sliding time window approach for N-C-MAPSS. https://github.com/Adithya-Thonse/N-CMAPSS_DL

Best

tilman151 commented 1 week ago

Dear @lurenyi233,

I left the windowing out for N-C-MAPSS because there were multiple ways of processing the data in the literature. Not all of them needed sliding windows.

For my own research, I used the option to pass a custom feature_extractor when constructing a RulDataModule (see the docs). The following extractor will produce sliding windows without consuming any more memory:

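import numpy as np


class WindowExtractor:
    def __init__(self, upper_window, lower_window):
        self.upper_window = upper_window  # number of flight cycles per window
        self.lower_window = lower_window  # number of time steps per slice of a cycle

    def __call__(self, features, targets):
        # Assumed input shapes, inferred from the indexing below:
        # features [num_cycles, cycle_length, num_channels], targets [num_cycles]
        num_channels = features.shape[-1]
        num_slices = features.shape[1] // self.lower_window
        last_idx = num_slices * self.lower_window
        # Cut each cycle into non-overlapping slices, dropping leftover time steps
        reshaped = features[:, :last_idx].reshape(-1, self.lower_window, num_channels)
        # Pad so the sliding view below fits; the padded rows are never selected
        reshaped = np.pad(reshaped, ((0, num_slices - 1), (0, 0), (0, 0)), mode="empty")

        # View that spans upper_window cycles; taking every num_slices-th row
        # picks the same slice position from each of the consecutive cycles
        window_shape = (self.upper_window * num_slices, self.lower_window, num_channels)
        features = np.lib.stride_tricks.sliding_window_view(reshaped, window_shape)
        features = features[:, 0, 0, ::num_slices]

        # Label each window with the target of its last flight cycle
        window_cutoff = (self.upper_window - 1) * num_slices
        targets = targets.repeat(num_slices)[window_cutoff:]

        # features: [num_windows, upper_window, lower_window, num_channels]
        return features, targets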

The resulting windows slide over the flight cycles and tumble over the time steps within each cycle. When windowing, please be aware that not all flight cycles have the same length, so the shorter ones are padded.
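To make "slide over the flight cycles and tumble over the time steps" concrete, the following loop-based sketch spells out the same windowing explicitly. It is a hypothetical reference for illustration, not part of rul-datasets: the name `window_reference`, the synthetic data, and the assumption that `features` is already padded to shape `[num_cycles, cycle_length, num_channels]` are mine. Unlike a strided view, it copies the data.

```python
import numpy as np


def window_reference(features, targets, upper_window, lower_window):
    """Explicit loop version: slide over cycles, tumble over time steps."""
    num_cycles, cycle_len, num_channels = features.shape
    num_slices = cycle_len // lower_window
    # Tumble: cut every cycle into non-overlapping slices of lower_window steps
    slices = features[:, : num_slices * lower_window].reshape(
        num_cycles, num_slices, lower_window, num_channels
    )
    windows, labels = [], []
    # Slide: move one cycle at a time over upper_window consecutive cycles
    for start in range(num_cycles - upper_window + 1):
        for s in range(num_slices):
            # Same slice position s taken from each cycle in the window
            windows.append(slices[start : start + upper_window, s])
            # Label with the target of the last cycle in the window
            labels.append(targets[start + upper_window - 1])
    return np.stack(windows), np.array(labels)


# Synthetic stand-in: 10 cycles, 100 time steps each, 4 channels
features = np.random.default_rng(0).normal(size=(10, 100, 4))
targets = np.arange(10)[::-1]

windows, labels = window_reference(features, targets, upper_window=3, lower_window=25)
print(windows.shape, labels.shape)  # (32, 3, 25, 4) (32,)
```

With 10 cycles, a window of 3 cycles, and 4 slices per cycle, this gives (10 - 3 + 1) * 4 = 32 samples, each covering the same 25-step slice of three consecutive cycles.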

lurenyi233 commented 1 week ago

Dear author,

Thanks a lot for your comments. I agree with your point. I found another paper that appears to use a processing method similar to yours. As shown in the figure below, there is only one sample for each flight cycle. Wang, H., Zhang, Z., Li, X., Deng, X., & Jiang, W. (2023). Comprehensive dynamic structure graph neural network for aero-engine remaining useful life prediction. IEEE Transactions on Instrumentation and Measurement.

[image]

I also found another project that specifies the detailed settings for applying the sliding time window approach to generate samples for N-C-MAPSS. Here is the link for your reference:

I suggest closing the issue. I hope our discussion helps others who have similar questions.

Best