Open flyingtiger111 opened 1 week ago
The cross-reconstruction is motivated by insights from positive-pair contrastive learning methods such as BYOL and SimSiam. In these methods, a predictor predicts the feature of one view given the feature of another view, which can be seen as cross-reconstruction. Without cross-reconstruction, contrastive learning degrades to self-prediction, where the predictor can simply learn to output its input. This paradigm therefore helps prevent the decoder (predictor) from learning an "identity mapping".
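To make the argument concrete, here is a toy sketch (not the repository's implementation) of why a copy-the-input predictor is a perfect solution for self-prediction but not for cross-reconstruction. The names `view_a`, `view_b`, and `identity_predictor` are hypothetical, and the two "views" are just random vectors standing in for features of the same image from two different encoders.

```python
import random


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def identity_predictor(feature):
    """The shortcut solution: copy the input feature to the output."""
    return list(feature)


random.seed(0)

# Hypothetical stand-ins for two views of the same image:
# view_a could come from one encoder, view_b from another.
view_a = [random.gauss(0.0, 1.0) for _ in range(8)]
view_b = [x + random.gauss(0.0, 0.5) for x in view_a]  # related, not identical

# Self-prediction: the target is the predictor's own input.
self_loss = mse(identity_predictor(view_a), view_a)

# Cross-reconstruction: the target is the other view.
cross_loss = mse(identity_predictor(view_a), view_b)

print("self-prediction loss with identity mapping:", self_loss)
print("cross-reconstruction loss with identity mapping:", cross_loss)
```

Under these assumptions the identity mapping drives the self-prediction loss to exactly zero, so nothing discourages it, while it leaves a nonzero cross-reconstruction loss, so the predictor has to learn something beyond copying.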
Could you explain how config:E addresses the 'identical shortcut' problem from the perspective of anomaly detection? Or did you solve this problem simply by mimicking the paradigm of contrastive learning (such as SimSiam)?
I know that the encoder adapted to the target image domain can generate feature representations from a domain-specific view, while the frozen encoder provides the view from the pre-training image domain.
However, I am still unclear about the purpose of cross-reconstruction and the benefits it offers.
Could you provide more details and insights?