Extract Code Embeddings

I'd like to embed source code files in order to measure their semantic similarity, but I'm not sure which model and task are best suited to this, since no 'representation-learning' or 'feature-extraction' task is listed.

Also, if a suitable model exists, would it be possible to use it with the Hugging Face "feature-extraction" pipeline?
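To make the question concrete, here is roughly what I have in mind. This is only a minimal sketch under my own assumptions: the helper names (`build_extractor`, `mean_pool`, `embed`, `cosine_similarity`) are hypothetical, `microsoft/codebert-base` is just a placeholder checkpoint from the Hub rather than a CodeTF model, and mean pooling over token vectors is one common choice, not necessarily the right one here.

```python
import math


def build_extractor(model_name="microsoft/codebert-base"):
    """Create a Hugging Face feature-extraction pipeline.

    The model name is a placeholder assumption, not a CodeTF checkpoint.
    The import is kept local so the helpers below work without transformers.
    """
    from transformers import pipeline

    return pipeline("feature-extraction", model=model_name)


def mean_pool(token_vectors):
    """Average per-token vectors into a single fixed-size vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(tok[i] for tok in token_vectors) / n for i in range(dim)]


def embed(code, extractor):
    """Embed one code string.

    For a single string, the feature-extraction pipeline returns a nested
    list shaped [1][num_tokens][hidden_size], so [0] selects the token vectors.
    """
    return mean_pool(extractor(code)[0])


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Intended usage would be along the lines of `extractor = build_extractor()`, then `cosine_similarity(embed(src_a, extractor), embed(src_b, extractor))` for two source strings. Whole files would probably exceed the model's maximum input length, so some truncation or chunking would be needed as well.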

salesforce/CodeTF, issue #26