zc8340311 / RobustAutoencoder

A combination of Autoencoder and Robust PCA
MIT License

getRecon function from l21RobustDeepAutoencoder makes predictions on the fixed L/LD part #7

Closed: robroe-tsi closed this issue 5 years ago

robroe-tsi commented 5 years ago

Hi,

I have one question regarding the inference/prediction method "getRecon" (RobustAutoencoder/model/l21RobustDeepAutoencoder.py or RobustAutoencoder/experiments/Outlier Detection/RDAE_tensorflow.py):

The method looks like this:

def getRecon(self, X, sess):
    return self.AE.getRecon(self.L, sess=sess)

So it creates the reconstruction by running the trained autoencoder on the "cleaned" X, which is L (or LD in the paper). But shouldn't it be

def getRecon(self, X, sess):
    return self.AE.getRecon(X, sess=sess)

? Here the full X dataset is used. That way we can use the trained robust autoencoder to make inferences on new data (test or validation data). The resulting reconstruction can then be compared to the input, and we can define any anomaly score based on the difference between the two.
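
For illustration, the scoring could then look roughly like this (a rough sketch only; `rae`, `sess`, and `X_test` are placeholder names, not part of the repo):

import numpy as np

def anomaly_scores(X, recon):
    # Per-instance anomaly score: squared reconstruction error per row.
    return np.sum((X - recon) ** 2, axis=1)

# Hypothetical usage, assuming getRecon(X, sess) reconstructs arbitrary input:
#   recon = rae.getRecon(X_test, sess=sess)
#   scores = anomaly_scores(X_test, recon)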

Best, Roberto

zc8340311 commented 5 years ago

Hi Roberto, Thanks for your detailed checking. Your comments make sense, and I have had similar thoughts before. Here is my dilemma: if I put X in the getRecon function, the reconstruction will, as you say, be reasonable to apply to new data. But I think it is not faithful to the current training data. The Robust Autoencoder is designed to find the cleaned features after isolating the anomalies, so I think the truthful reconstruction should be getRecon from L, which contains no anomalies. That is why I used L when I wrote this code.

On the other hand, another voice inside me says that L already represents the cleaned data, so getRecon should be changed to adapt to new data. To be truthful with you, I haven't updated this repo for months. Thanks for your thoughtful comments, which got me thinking about these things again. I think I will probably change to L as you suggest.
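
For reference, the alternating scheme behind this is roughly the following (a sketch only, not the repo's exact code; `fit_ae` and `recon_ae` are placeholders for whatever inner autoencoder is used):

import numpy as np

def l21_shrink(eps, mat):
    # Column-wise l2,1 proximal step: columns with small norm are zeroed,
    # larger ones are shrunk; the surviving entries form the sparse part S.
    out = np.zeros_like(mat)
    for j, n in enumerate(np.linalg.norm(mat, axis=0)):
        if n > eps:
            out[:, j] = mat[:, j] * (1.0 - eps / n)
    return out

def robust_decompose(X, fit_ae, recon_ae, lam, n_outer=20):
    # Alternate between fitting the autoencoder on the cleaned part L = X - S
    # and re-estimating the sparse part S from the reconstruction residual.
    S = np.zeros_like(X)
    for _ in range(n_outer):
        L = X - S
        fit_ae(L)
        S = l21_shrink(lam, X - recon_ae(L))
    return X - S, S  # L ("cleaned" data) and S (isolated anomalies)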

Sincerely, Chong

robroe-tsi commented 5 years ago

Hi Chong,

thanks for the fast response. I appreciate that you care about the code even after the paper is published. I understand your dilemma a little, but I want to advocate again in the direction of changing the code to use X instead of L/LD. For me there are two use cases for this robust autoencoder.

First, there is data cleaning/filtering. In this case I train the robust autoencoder on the raw data inside a data preprocessing step and use the LD data object (from your paper) that I get from this as input to my modeling step (which could in principle be any ML technique that benefits from a cleaner data set). Here I don't need a getRecon method, as the training already gives me the cleaned training data (LD) plus the detected anomalies (S), especially if I use the instance-based anomaly definition.

The second case is anomaly detection itself, which for me as a practitioner only makes sense if I can apply it to new/future (always unlabeled) data. In this case I use a score computed somehow on the difference between input and reconstruction, as in the sketch below. The benefit of your method here could come from a more stable autoencoder model compared to the one I would get using a plain autoencoder.
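
Concretely, for the second use case, thresholding could look like this (a minimal sketch, complementing the hypothetical anomaly_scores helper above):

import numpy as np

def fit_threshold(train_scores, q=95.0):
    # Calibrate a score threshold on the training-data scores.
    return np.percentile(train_scores, q)

def flag_anomalies(scores, threshold):
    # Mark new instances whose reconstruction error exceeds the threshold.
    return scores > threshold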

Best, Roberto

BTW, just one thought: since your algorithm treats learning the autoencoder and the penalty separately, I could in principle plug any sort of autoencoder architecture into your optimization algorithm, not only feed-forward ones but also those based on LSTMs (for example, for temporal/sequence data). What do you think?
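
For example, an LSTM-based inner autoencoder could be sketched in Keras like this (an illustrative assumption, not code from this repo; the shapes are made up):

from tensorflow import keras

timesteps, n_features, latent_dim = 30, 8, 16

inputs = keras.Input(shape=(timesteps, n_features))
z = keras.layers.LSTM(latent_dim)(inputs)                    # encode sequence to a vector
z = keras.layers.RepeatVector(timesteps)(z)                  # expand back to sequence length
h = keras.layers.LSTM(latent_dim, return_sequences=True)(z)  # decode step by step
outputs = keras.layers.TimeDistributed(keras.layers.Dense(n_features))(h)

lstm_ae = keras.Model(inputs, outputs)
lstm_ae.compile(optimizer="adam", loss="mse")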

zc8340311 commented 5 years ago

Hi Roberto, I made a mistake in my last comment where I said to use "L"; actually, I meant "X". Your argument is really convincing. Thanks a lot. As for your idea of using another kind of autoencoder in place of the normal one, it is a very good idea. I have one follow-up work that uses a sparse autoencoder as a replacement, and I am also currently working with someone who is replacing the normal autoencoder with a variational autoencoder. In my perspective, the key idea, X = L + S, is a framework, not just a model. So I think LSTM-based autoencoders would be another interesting direction. Thank you.

Sincerely, Chong