Clay-foundation / model

The Clay Foundation Model (in development)
https://clay-foundation.github.io/model/
Apache License 2.0

Save the losses when creating embeddings #207

Closed brunosan closed 4 months ago

brunosan commented 6 months ago

In the process of generating embeddings from a trained deep learning model, we perform inference on an input image and save the resulting embedding, which captures the semantic reconstructions of salient features as determined by the model. To enhance the utility of these embeddings, I propose saving the reconstruction loss alongside the embedding vector.

The reconstruction loss, calculated as the difference between the input image and the model's reconstructed output, provides valuable insights into the semantic content and anomalies present in the input image:

  1. Images with expected semantics that align well with the model's training data will exhibit a smaller reconstruction loss, indicating that the model can effectively capture and reconstruct the salient features.
  2. Images containing rare, unexpected, or anomalous semantics will result in a larger reconstruction loss, as the model may struggle to accurately reconstruct the input due to the presence of features outside its learned representation.
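To make the metric concrete, here is a minimal sketch of a per-image reconstruction loss as mean squared error (one common choice; the model's own training loss could be used instead). The arrays are toy stand-ins for an input image and two candidate reconstructions:

```python
import numpy as np

def reconstruction_loss(original: np.ndarray, reconstructed: np.ndarray) -> float:
    """Mean squared error between the input image and the model's reconstruction."""
    return float(np.mean((original - reconstructed) ** 2))

# A faithful reconstruction yields a small loss; a poor one, a larger loss.
image = np.ones((4, 4))
close = image + 0.01   # reconstruction close to the input
far = image + 0.5      # reconstruction far from the input
assert reconstruction_loss(image, close) < reconstruction_loss(image, far)
```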

Real-world applications:

  1. Monitoring changes in satellite imagery: By comparing embeddings and reconstruction losses of a region (e.g., Kiev) before and after significant events (war), we can detect and quantify the extent of semantic changes. Pre-event images will likely have smaller losses, while post-event images containing destruction, damage, and other anomalies will have higher losses.
  2. Anomaly detection in various domains: The reconstruction loss can serve as a valuable metric for detecting anomalies, such as rare events (city floods, locust plagues), unusual semantics (algae blooms, green pools), or "noise" (fog, smog, ships in the ocean, image artifacts). By setting appropriate thresholds on the reconstruction loss, we can flag images containing such anomalies for further analysis.
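The thresholding idea in point 2 could be sketched as follows. The `z`-score cutoff is a simple illustrative heuristic, not part of any existing pipeline; in practice the threshold would be tuned per application:

```python
import numpy as np

def flag_anomalies(losses, threshold=None, z=1.5):
    """Return indices of images whose reconstruction loss is unusually high.

    If no absolute threshold is given, fall back to mean + z * std of the
    batch (a simple heuristic cutoff for illustration).
    """
    losses = np.asarray(losses, dtype=float)
    if threshold is None:
        threshold = losses.mean() + z * losses.std()
    return np.flatnonzero(losses > threshold)

# Four ordinary scenes and one anomalous one (e.g. post-flood imagery).
losses = [0.10, 0.12, 0.11, 0.95, 0.09]
print(flag_anomalies(losses))  # flags the high-loss image at index 3
```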

Implementation: Modify the embedding generation pipeline to calculate and save the reconstruction loss alongside the embedding vector. This can be done by comparing the input image with the model's reconstructed output, using e.g. the same loss function used during training, or an alternative metric (MSE, ...).
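A sketch of what that pipeline change could look like. The `encode`/`decode` functions here are hypothetical toy stand-ins (per-band means and a broadcast back), not the actual Clay model; the real pipeline would call the trained encoder and reconstruction head:

```python
import numpy as np

# Hypothetical stand-ins for the trained model's encoder and decoder.
def encode(image):
    return image.mean(axis=(0, 1))            # toy "embedding": per-band means

def decode(embedding, shape):
    return np.broadcast_to(embedding, shape)  # toy reconstruction

def embed_with_loss(image):
    """Return the embedding together with its reconstruction loss (MSE)."""
    embedding = encode(image)
    reconstruction = decode(embedding, image.shape)
    loss = float(np.mean((image - reconstruction) ** 2))
    return embedding, loss

# Save both values side by side instead of the embedding alone.
image = np.random.default_rng(0).random((8, 8, 3))
embedding, loss = embed_with_loss(image)
record = {"embedding": embedding.tolist(), "reconstruction_loss": loss}
```

Saving the loss as a sibling field of the embedding keeps existing consumers untouched while exposing the new signal to anomaly-detection users.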

Action Items:

I think this change not only makes our outputs much more useful, but also gives them a measurable relative confidence, AND highlights operational bias (making the loss a feature, not a thing to get rid of).

cc @MaceGrim @danhammer for the utility feedback.

brunosan commented 6 months ago

+1 on this based on customer needs to lean on embedding shift for anomaly detection. Not having the losses diminishes the value of the anomaly signal.

yellowcap commented 4 months ago

Reconstruction losses would require running the reconstruction head during inference. And while the loss might indicate ease of reconstruction, it does not always mean the embedding is better: simpler images are easier to reconstruct than complicated ones. So I am closing this for now; let's revisit when we want to dive deeper into this topic.