AdaptiveMotorControlLab / CEBRA

Learnable latent embeddings for joint behavioral and neural analysis - Official implementation of CEBRA
https://cebra.ai

Normalized data produces different embedding #131

Closed: melissaxdu closed this issue 5 months ago

melissaxdu commented 5 months ago

Bug description

Hello! I'm a student at MIT majoring in Computer Science and Neuroscience, and I'm currently working on a project in the Wilson Lab at the McGovern Institute, using the CEBRA model to build embeddings of hippocampal data. We have spike rate data that we normalized in different ways (applying different scaling factors) and then trained an embedding on each version. Since the information in each dataset is the same, we expected the embeddings to look roughly the same as well. However, they turned out quite different (I've attached an image of the plotted embeddings for reference).

[image: plotted CEBRA embeddings for the differently normalized datasets]
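
For concreteness, here is a minimal sketch of the kind of pipeline described above; the data, scaling factors, and model settings are placeholders, not the actual values from our project:

```python
import numpy as np
import cebra

# Placeholder spike rate matrix of shape (timepoints, neurons).
spike_rates = np.random.rand(10000, 120).astype(np.float32)

# The same data under two different (hypothetical) scaling factors,
# with one CEBRA model trained on each version.
embeddings = {}
for name, scale in {"raw": 1.0, "rescaled": 100.0}.items():
    model = cebra.CEBRA(
        model_architecture="offset10-model",
        batch_size=512,
        max_iterations=5000,
        output_dimension=3,
    )
    model.fit(scale * spike_rates)  # discovery-driven (unsupervised) training
    embeddings[name] = model.transform(scale * spike_rates)
```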

Somewhat relatedly, I was wondering whether there are any recommended methods for evaluating the "goodness" of an embedding, or the degree of similarity between two embeddings?

Operating System

Mac

CEBRA version

0.4.0

Device type

V100

Steps To Reproduce

https://drive.google.com/file/d/1Ys4Lp4m9lxM_XR_BaloYRk07vrmL_DfT/view?usp=sharing

Relevant log output

No response

Anything else?

No response

MMathisLab commented 5 months ago

Hi Melissa, these look quite similar, but as we describe in the paper and in the documentation, you should quantify the goodness of fit rather than compare visualizations: look at the InfoNCE loss curves at the same point in training (see Figure 2), and measure consistency between embeddings (see Figures 1 and 2). How an embedding looks when plotted depends strongly on how you train it (again, see Figure 2).
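
As a rough sketch of both checks, assuming the sklearn-style metrics module (`cebra.sklearn.metrics`) shipped with 0.4.0 (the data and model settings below are placeholders, not your actual setup):

```python
import numpy as np
import cebra

# Hypothetical stand-ins: two normalized versions of the same data,
# with a CEBRA model fit on each.
X_a = np.random.rand(10000, 120).astype(np.float32)
X_b = 100.0 * X_a
model_a = cebra.CEBRA(model_architecture="offset10-model", batch_size=512,
                      max_iterations=5000, output_dimension=3).fit(X_a)
model_b = cebra.CEBRA(model_architecture="offset10-model", batch_size=512,
                      max_iterations=5000, output_dimension=3).fit(X_b)

# 1) Goodness of fit: InfoNCE loss, evaluated for each model after the
#    same amount of training.
loss_a = cebra.sklearn.metrics.infonce_loss(model_a, X_a, num_batches=500)
loss_b = cebra.sklearn.metrics.infonce_loss(model_b, X_b, num_batches=500)
print(f"InfoNCE: {loss_a:.4f} vs {loss_b:.4f}")

# 2) Consistency: both embeddings cover the same timepoints, so the
#    "runs" mode (a linear fit between embeddings, reported as R^2) applies.
emb_a = model_a.transform(X_a)
emb_b = model_b.transform(X_b)
scores, pairs, ids = cebra.sklearn.metrics.consistency_score(
    embeddings=[emb_a, emb_b], between="runs"
)
print(scores)
```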

I'll close this issue as it's not a bug; it's a Discussions-tab item 😊.