What is the recommended way to aggregate multiple semantic code embeddings from the same repository into a single representation of the repository's overall semantics? Currently I am averaging the UniXCoder embeddings and using cosine similarity to evaluate the result, but I am not sure this is the right approach.
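To make the setup concrete, here is a minimal sketch of the approach described above, assuming the per-snippet embeddings have already been computed. The toy 3-dimensional vectors stand in for real UniXCoder outputs, and the helper names `mean_pool` and `cosine_similarity` are illustrative only, not part of any UniXCoder API.

```python
import math

def mean_pool(embeddings):
    """Element-wise average of equal-length vectors (one per code snippet)."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy placeholder vectors (assumption): real embeddings would come from the model.
repo_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
repo_b = [[1.0, 1.0, 0.0]]

# One vector per repository, then compare the two repositories.
repo_a_vec = mean_pool(repo_a)
repo_b_vec = mean_pool(repo_b)
score = cosine_similarity(repo_a_vec, repo_b_vec)
```

Here each repository is reduced to one vector by a plain unweighted mean, so every snippet contributes equally regardless of its length or importance.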
Thanks in advance!