salesforce / CodeTF

CodeTF: One-stop Transformer Library for State-of-the-art Code LLM
Apache License 2.0

Extract Code Embeddings #26

Open SasCezar opened 1 year ago

SasCezar commented 1 year ago

I'd like to compute embeddings of source code files in order to measure the semantic similarity of their content. I'm not sure which model and task are best suited for this, since no 'representation-learning' or 'feature-extraction' task is mentioned.

Furthermore, if a suitable model is available, would it be possible to use it with the HuggingFace "feature-extraction" pipeline?
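
For illustration, here is a minimal sketch of the similarity computation the question is about. It assumes that some model has already produced one vector per token for each file (as far as I know, the HuggingFace "feature-extraction" pipeline returns nested lists of this kind) and that mean pooling is an acceptable way to reduce them to one embedding per file. The helper names `mean_pool` and `cosine_similarity` and the toy vectors are hypothetical and not part of CodeTF.

```python
import math


def mean_pool(token_vectors, mask=None):
    """Average per-token vectors into one vector, skipping masked (padding) tokens."""
    if mask is None:
        mask = [1] * len(token_vectors)
    kept = [vec for vec, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("no unmasked tokens to pool")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-ins for per-token model outputs (assumption: real vectors
# would come from a code model, not be written by hand).
file_a_tokens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
file_a_mask = [1, 1, 0]  # the last token is padding and is ignored
file_b_tokens = [[2.0, 2.0]]

emb_a = mean_pool(file_a_tokens, file_a_mask)
emb_b = mean_pool(file_b_tokens)
similarity = cosine_similarity(emb_a, emb_b)
```

Mean pooling is only one possible choice here. Taking the vector of a dedicated first token, or using a model trained specifically for similarity, may work better depending on the model.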