After cloning the repository, run the following command inside the repository folder to install the contrastive_tda package:
pip install -e .
Installing development hooks
make install_hooks
This will install several "hooks" - scripts that run automatically when certain git commands are run. These are generally useful for maintaining code quality and consistency.
To run a git command without the hooks, use the --no-verify flag. For example, to commit without running the hooks, run the following command:
git commit --no-verify
DVC is used to track data and models. To add data to DVC and upload it to the remote, run the following commands:
dvc add data/<data_file>
dvc push
Just as git tracks changes to files and stores copies of them in a remote repository, DVC tracks changes to data files and stores copies of them in a remote. To list the remotes configured for this project, run the following command:
dvc remote list
Environment variables should be stored in a .env file in the root directory of the repository. The .env file should NOT be tracked by git. To add an environment variable, add a line to the .env file in the following format:
export COHERE_API_KEY=<api_key>
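As an illustration of how a script might pick up this variable, here is a minimal standard-library sketch. The helper names parse_env_line and load_env_file are hypothetical and not part of this repository; the project may well load the file another way (for example by sourcing it in the shell or with a dotenv library).

```python
import os
from pathlib import Path
from typing import Optional, Tuple


def parse_env_line(line: str) -> Optional[Tuple[str, str]]:
    """Parse one `export KEY=value` or `KEY=value` line.

    Returns None for blank lines, comments, and lines without a key.
    (Hypothetical helper, shown only for illustration.)
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    if stripped.startswith("export "):
        stripped = stripped[len("export "):].lstrip()
    key, sep, value = stripped.partition("=")
    if not sep or not key.strip():
        return None
    # Drop surrounding whitespace and optional quotes around the value.
    return key.strip(), value.strip().strip("'\"")


def load_env_file(path: str = ".env") -> None:
    """Copy variables from the .env file into os.environ.

    Variables that are already set in the environment are not overridden.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return
    for line in env_path.read_text().splitlines():
        parsed = parse_env_line(line)
        if parsed is not None:
            key, value = parsed
            os.environ.setdefault(key, value)


if __name__ == "__main__":
    load_env_file()
    api_key = os.environ.get("COHERE_API_KEY")
    print("COHERE_API_KEY is set" if api_key else "COHERE_API_KEY is missing")
```

Reading the key from os.environ keeps it out of the source code, which is the reason the .env file stays untracked.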
The ideal data processing script has
For example, the minimal data processing operation for the embed_reviews function is to embed a single review. In this case, there should be a function called embed_review that takes a single review as an argument and returns the embedded review.
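A minimal sketch of this split might look as follows. The signatures are assumptions, not the repository's actual API: the embedder is passed in as a callable so the example runs without an API key, and toy_embed_text is a made-up placeholder standing in for the real embedding call.

```python
from typing import Callable, List, Sequence

Embedding = List[float]


def embed_review(review: str, embed_text: Callable[[str], Embedding]) -> Embedding:
    """Minimal operation: embed exactly one review."""
    return embed_text(review)


def embed_reviews(
    reviews: Sequence[str], embed_text: Callable[[str], Embedding]
) -> List[Embedding]:
    """Batch operation: apply the single-review function to every review."""
    return [embed_review(review, embed_text) for review in reviews]


def toy_embed_text(text: str) -> Embedding:
    """Placeholder embedder (hypothetical): [character count, word count]."""
    return [float(len(text)), float(len(text.split()))]


if __name__ == "__main__":
    print(embed_review("great product", toy_embed_text))
    print(embed_reviews(["ok", "not bad"], toy_embed_text))
```

Keeping the single-item function separate makes it easy to test on one review before running the batch version over a full dataset.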
Some exceptions:
These exceptions are not rare, but they are the minority of cases.
Functions should generally start with a verb.
Examples of name changes I would suggest:
embedding_reviews -> embed_reviews