Open YingxiaoKong opened 3 years ago
Hi Yingxiao,
Thanks for your question and apologies for the late reply!
Yes, the two models are trained separately. We first train a VAE on the task of reconstructing short windows (usually 24-144 samples long). After the VAE has converged, we train an LSTM to do one-step-ahead prediction on a sequence of latent codes from the VAE.
After both models are trained, we use them together for anomaly detection.
The convergence of the system relies on the convergence of both models. There's no guarantee that convergence will happen, just as there's no guarantee for any neural network training, but in practice it usually does. I think training both models together could also work. We didn't go that route because, before the VAE learns a sensible embedding, the prediction task for the LSTM is not really informative. So we think it's more efficient to get the VAE working first and let the LSTM predict on the stable patterns the VAE has derived.
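For concreteness, the two-stage procedure above might look roughly like this. This is a minimal PyTorch sketch only; the repo itself uses TensorFlow with a convolutional encoder, and all layer sizes, names, and hyperparameters here are illustrative:

```python
import torch
import torch.nn as nn

# Illustrative sketch of the two-stage training: stage 1 trains a VAE
# on window reconstruction; stage 2 freezes it and trains an LSTM to
# predict the next latent code. Everything here is a made-up toy.
WIN, LATENT, SEQ = 24, 4, 10  # window length, latent dim, sequence length

class TinyVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(WIN, 2 * LATENT)  # outputs mean and log-variance
        self.dec = nn.Linear(LATENT, WIN)
    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
        return self.dec(z), mu, logvar

vae = TinyVAE()
lstm = nn.LSTM(LATENT, 32, batch_first=True)
head = nn.Linear(32, LATENT)

x = torch.randn(64, SEQ, WIN)  # dummy data: 64 sequences of SEQ windows

# Stage 1: train the VAE alone on reconstructing individual windows.
opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
for _ in range(30):
    recon, mu, logvar = vae(x.reshape(-1, WIN))
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    loss = (recon - x.reshape(-1, WIN)).pow(2).mean() + 1e-3 * kl
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: freeze the VAE, embed each window, and train the LSTM to
# predict the next latent code from the preceding ones.
with torch.no_grad():
    z = vae.enc(x).chunk(2, dim=-1)[0]  # posterior means, (64, SEQ, LATENT)
opt2 = torch.optim.Adam(list(lstm.parameters()) + list(head.parameters()), lr=1e-3)
for _ in range(30):
    out, _ = lstm(z[:, :-1])   # read the first SEQ-1 codes
    pred = head(out[:, -1])    # predict the final code
    loss2 = (pred - z[:, -1]).pow(2).mean()
    opt2.zero_grad(); loss2.backward(); opt2.step()

# At detection time, an anomaly score can combine the VAE reconstruction
# error with the LSTM prediction error on the latent codes.
```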
Hope this clarifies your questions.
Shuyu,
That does help a lot and thank you so much for your reply! I'm going to try it on my problem and hopefully it will converge too!
Best, Yingxiao
Hi,
I have gone through your code and I'm still a little bit confused about the structures of these two models (I'm not very good at reading code, forgive me): are they independent from each other?
When I looked at the structure of the VAE model, the input is the original sliding window at time step t and the output is the reconstructed window at the same time step t, not the window at time step t+1, and there is no layer taking input from or giving output to the LSTM model. So I guess both models are trained separately: after the VAE is trained, the LSTM utilizes the information from the VAE and predicts the embedded window at the next time step. Is this correct? If so, how do you guarantee the convergence of these two models, given that the LSTM relies on the performance of the VAE?
Another question is how to process the time-series input in the encoder: I saw you used convolutional layers there; I guess an LSTM would also work?
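For illustration, a hypothetical sketch of what an LSTM-based encoder could look like in place of the convolutional layers. This is not from the repo; the class, layer sizes, and dimensions are all made up:

```python
import torch
import torch.nn as nn

# Hypothetical LSTM-based VAE encoder: run the window through a
# recurrent layer one sample at a time, then map the final hidden
# state to the latent mean and log-variance. Sizes are illustrative.
class LSTMEncoder(nn.Module):
    def __init__(self, latent_dim=4, hidden=32):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.to_stats = nn.Linear(hidden, 2 * latent_dim)  # mean and log-variance
    def forward(self, window):                       # window: (batch, win_len)
        _, (h, _) = self.rnn(window.unsqueeze(-1))   # each sample is one time step
        return self.to_stats(h[-1]).chunk(2, dim=-1)

enc = LSTMEncoder()
mu, logvar = enc(torch.randn(8, 24))  # 8 windows of 24 samples each
```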