FlorentF9 / DeepTemporalClustering

:chart_with_upwards_trend: Keras implementation of the Deep Temporal Clustering (DTC) model
MIT License

Dimension Reduction #10

Closed muhmmadzs closed 2 years ago

muhmmadzs commented 3 years ago

Hi, thank you very much for your implementation. While looking into the autoencoder architecture, I didn't understand how it performs dimension reduction: the encoder passes along the full sequence because return_sequences is set to True. I'm sorry if I haven't fully understood your implementation. I am working on an encoder for time series data. My data consists of 200 samples, each with 1500 points (an acceleration signal from measurements), i.e. shape (200, 1500). I want to encode these 200 responses into a latent space of, say, 2 or 4 dimensions, so the latent variable would have shape (2, 1500). Can you help me out here on how to use this architecture? Zohaib

FlorentF9 commented 3 years ago

First, you must know that the dimension reduction is done on the NUMBER OF VARIABLES of your MULTIVARIATE time series, not on the number of time steps (i.e. the length of the series, 1500 in your example). In this model, the encoder is recurrent and outputs a sequence of the same length as the input series. If you have only one-dimensional inputs (a data set of size (200, 1, 1500) in your example), the dimension will not be reduced, but the encoder will still extract new features.

The dimension of the latent space is determined by the number of units in the 2nd BiLSTM layer, controlled by the parameter n_units. For example, you can use n_units = [50, 2] to obtain latent sequences of dimension (2, 1500), i.e. an encoded data set of size (200, 2, 1500).
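As an illustration, the reply above can be sketched as two stacked BiLSTM layers with return_sequences=True: the temporal axis is preserved while the feature axis shrinks to the last entry of n_units. This is a minimal sketch, not the repo's full TAE (which also has a convolutional front end), and note that Keras orders shapes as (samples, timesteps, features) rather than (samples, features, timesteps):

```python
from tensorflow.keras.layers import Input, LSTM, Bidirectional
from tensorflow.keras.models import Model

n_steps, n_vars = 1500, 1        # series length, number of input variables
n_units = [50, 2]                # as in the example above: 2-dim latent space

inputs = Input(shape=(n_steps, n_vars))
# return_sequences=True keeps the temporal axis: the output is still a sequence.
x = Bidirectional(LSTM(n_units[0], return_sequences=True), merge_mode="sum")(inputs)
# The feature axis is reduced to n_units[1] = 2 here ("sum" merges both directions).
latent = Bidirectional(LSTM(n_units[1], return_sequences=True), merge_mode="sum")(x)

encoder = Model(inputs, latent)
print(encoder.output_shape)  # (None, 1500, 2): length preserved, features reduced
```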

muhmmadzs commented 3 years ago

Thank you for your reply! I got the idea.

muhmmadzs commented 3 years ago

Hi, I just want to ask one more question. If I want to detect anomalies in a MULTIVARIATE time series, is it a good idea to use two autoencoders: one for dimension reduction (similar to this one) on the NUMBER OF VARIABLES, say from six variables down to one, and then a second autoencoder for data compression and reconstruction for anomaly detection? Or is there a better way to do it, say in a single model? Your response will be highly appreciated.

FlorentF9 commented 3 years ago

IMHO there is no reason to use two separate models. For example, a convolutional or recurrent AE can directly learn to reconstruct your multivariate series. I don't know what kind of anomalies you are looking for (anomalous time points inside each series, or whole series that are anomalous?), but in any case you should adapt the architecture to the nature of your data and application (are the inputs all the same length? do you need online anomaly detection? etc.).

FlorentF9 commented 3 years ago

Also, a length of 1500 is very long; for recurrent nets in particular, it will be computationally costly and difficult if you need to capture very-long-term dependencies. Maybe start with a fully convolutional architecture. Moreover, 200 samples seems very low... maybe a non-deep algorithm would be a better choice, or finding a different representation by incorporating more prior knowledge?

muhmmadzs commented 3 years ago

Thanks for your reply! To explain a few things: I am looking for offline anomaly detection, and an anomaly is a whole series. My problem is related to damage detection. For example, I run a batch of vehicles over a bridge; each vehicle has variable properties like weight, speed, suspension properties, etc., and is instrumented with sensors at four locations (the measured responses are acceleration and angular velocity at the front and at the back of each vehicle). I trained the AE model on the undamaged state, and the reconstruction error is used for damage detection. A main concern is also assessing the severity of the damage, which is directly proportional to the reconstruction error of the AE.

I managed to sort out this problem well with a convolutional AE when I only consider a single sensor, i.e. inputs of shape [number of samples, timesteps, 1].

Now I want to use all sensors simultaneously to extract features for damage detection and severity comparison, but I am still not sure how to do it. I have adjusted TAE.py to reduce the encoder output from [number of samples, timesteps, 4] to a single representation [number of samples, timesteps, 1], but the model is not learning the underlying features: the loss is not decreasing significantly. Regarding the timesteps and the number of samples, both can be increased or decreased.
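The whole-series scoring described above (one reconstruction error per sample, used as a damage/severity indicator) can be sketched as follows; the helper name is hypothetical, not from the repo:

```python
import numpy as np

def anomaly_scores(x, x_hat):
    # One MSE per series, averaged over timesteps and channels:
    # shape (n_samples, timesteps, n_channels) -> (n_samples,)
    return np.mean((x - x_hat) ** 2, axis=(1, 2))

# Toy check: a constant offset of 1 gives a score of 1 for every series.
x = np.zeros((3, 10, 4))
scores = anomaly_scores(x, np.ones_like(x))
print(scores)  # [1. 1. 1.]
```

Series whose score exceeds a threshold calibrated on the undamaged training set would then be flagged, with the score magnitude serving as the severity proxy.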

FlorentF9 commented 3 years ago

For offline detection, I would try without the recurrent part first, as it is more difficult to train and tune. Convolutional nets can handle as many variables (called channels) as you want, so this should not be a limitation. You can reduce the temporal dimension using pooling, and maybe add a FC layer at the bottleneck to obtain a fixed-size vector embedding. You will find many papers on how to train 1D conv autoencoders this way. For example:

[1500, 4] -- conv1D --> [1500, 4] -- pool1D(5) --> [300, 4] -- conv1D --> [300, 4] -- pool1D(5) --> [60, 4] -- flatten --> [240] -- FC --> [10] -- FC --> [240] --> DECODER (symmetric, with deconv/upsampling)

Here you will get a 10-dimensional encoded vector. Really just an example.
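The pipeline above maps directly onto Keras layers; here is a minimal sketch (the kernel width of 9 and the ReLU activations are illustrative assumptions, not prescribed in the comment):

```python
from tensorflow.keras import layers, Model

# Keras convention is (timesteps, channels), matching [1500, 4] above.
inp = layers.Input(shape=(1500, 4))
x = layers.Conv1D(4, 9, padding="same", activation="relu")(inp)  # [1500, 4]
x = layers.MaxPooling1D(5)(x)                                    # [300, 4]
x = layers.Conv1D(4, 9, padding="same", activation="relu")(x)    # [300, 4]
x = layers.MaxPooling1D(5)(x)                                    # [60, 4]
x = layers.Flatten()(x)                                          # [240]
code = layers.Dense(10, activation="relu")(x)                    # [10] embedding

# Decoder, symmetric with upsampling instead of pooling.
x = layers.Dense(240, activation="relu")(code)                   # [240]
x = layers.Reshape((60, 4))(x)                                   # [60, 4]
x = layers.UpSampling1D(5)(x)                                    # [300, 4]
x = layers.Conv1D(4, 9, padding="same", activation="relu")(x)    # [300, 4]
x = layers.UpSampling1D(5)(x)                                    # [1500, 4]
out = layers.Conv1D(4, 9, padding="same")(x)                     # reconstruction

ae = Model(inp, out)
encoder = Model(inp, code)
ae.compile(optimizer="adam", loss="mse")
```

Training `ae` on the undamaged data and reading reconstruction errors off the output would follow the same workflow as the single-sensor case, now with all 4 channels at once.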