CodeReviewer is a tool that uses machine learning to help developers review code. It is trained on a dataset of code from the hlxsites repositories and predicts how likely a given function is to break in the future.
We save the first version of each function (as it was when first merged) and how often it was changed afterwards. From this data we create an embedding for each function and store the embeddings in a Qdrant database. When a user uploads a function, we use its embedding to find the 5 most similar functions in the database. We then use the number of times those functions were changed to predict how likely the user's function is to break in the future.
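The lookup described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's actual code: the in-memory list stands in for the Qdrant collection, cosine similarity is assumed as the distance measure, and averaging the neighbours' change counts is an assumed way of turning them into a score. The names `cosine_similarity` and `predict_change_count` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def predict_change_count(query_embedding, database, k=5):
    """Estimate how often a function will change.

    database: list of (embedding, change_count) tuples; a stand-in
    for the Qdrant collection (assumption for this sketch).
    Returns the mean change count of the k most similar entries
    (averaging is an assumption, the source does not specify it).
    """
    if not database:
        raise ValueError("database is empty")
    # Rank stored functions by similarity to the query, most similar first.
    ranked = sorted(
        database,
        key=lambda entry: cosine_similarity(query_embedding, entry[0]),
        reverse=True,
    )
    neighbours = ranked[:k]
    return sum(count for _, count in neighbours) / len(neighbours)
```

For example, with a toy database `[([1.0, 0.0], 4), ([0.9, 0.1], 2), ([0.0, 1.0], 10)]`, calling `predict_change_count([1.0, 0.0], db, k=2)` averages the change counts of the two closest entries. In the real tool the nearest-neighbour search is done by Qdrant rather than by sorting a list.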
pip install -r requirements.txt
to install all Python dependencies
npm install
to install all Node dependencies
Install Qdrant:
docker pull qdrant/qdrant
docker run -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage:z \
qdrant/qdrant
../inputData
python main.py
to run the whole app, including getting the data from the git repo, generating embeddings, creating the database, and running the query.
python test_getFuntionData.py