I've set up pyenv and Poetry to manage the project's Python virtual environment and dependencies. Together they make the environment easy to reproduce, which will be useful when running tasks on the cluster.
After you install these two programs, run the following commands to set up the environment:
pyenv install 3.9.12
pyenv virtualenv 3.9.12 repeat_identification
# (verify that the repeat_identification environment is activated by running `pyenv version`)
poetry install
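As an optional sanity check after the commands above, here is a minimal sketch for confirming that the active interpreter is the one pyenv installed. It is not part of this PR; the `interpreter_matches` helper and the `EXPECTED_VERSION` constant are hypothetical names used only for illustration.

```python
import sys

# Assumption: the project targets exactly the version installed with pyenv above.
EXPECTED_VERSION = (3, 9, 12)


def interpreter_matches(version_info=None, expected=EXPECTED_VERSION):
    """Return True if the (major, minor, micro) version equals `expected`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:3]) == tuple(expected)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if interpreter_matches() else "unexpected"
    print(f"Python {current}: {status}")
```

Running this inside the repeat_identification environment should report the 3.9.12 interpreter; anything else suggests the environment is not active.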
The second change: I realized that all repeat families can be downloaded quickly from Dfam before the annotation download starts, which speeds up that process. (The families are also saved locally for subsequent runs.)
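The "download once, reuse on later runs" idea can be sketched roughly as follows. This is only an illustration of the caching pattern, not the code in this PR: `load_cached` and the `fetch` callable are hypothetical, and the real download from Dfam would be whatever function is passed in as `fetch`.

```python
from pathlib import Path
from typing import Callable


def load_cached(cache_path: Path, fetch: Callable[[], bytes]) -> bytes:
    """Return the cached bytes if present; otherwise fetch, save, and return them."""
    if cache_path.exists():
        # A previous run already saved the data, so skip the download.
        return cache_path.read_bytes()
    data = fetch()
    # Create the cache directory on first use, then store the result.
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_bytes(data)
    return data
```

Passing the download step in as a callable keeps the caching logic independent of how the families are actually retrieved.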
I also formatted the code with Black (a good, handy code formatter for Python).
Let me know in the comments or by email if you have any questions.
Installation instructions for pyenv and Poetry are here; let me know if you have any trouble installing them: https://github.com/pyenv/pyenv#installation https://python-poetry.org/docs/master/#installing-with-the-official-installer