rotary-genomics / spokewrench

Toolkit for manipulating circular DNA sequence elements

BSD 3-Clause "New" or "Revised" License

Add basic installation test #16

Closed jmtsuji closed 3 months ago

jmtsuji commented 3 months ago

Basic GitHub Actions workflow to install spokewrench in a conda env. Simplified version of #15.

jmtsuji commented 3 months ago

@LeeBergstrand I've made a new branch that does not include the extraneous test files from #15 but contains the basic GitHub Actions framework I set up. For now, GitHub Actions just checks that spokewrench can be installed (along with dependencies) and doesn't do any end-to-end tests. I will work on end-to-end tests for rotate later on, but for now, do you think this might be enough progress to merge this PR into develop? If so, I'd be happy to see your review of this PR. Thanks.
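For readers who have not seen the workflow file, an install-only check of the kind described here might look roughly like the sketch below. It is not the actual file from this PR. The workflow file name, the env name `spokewrench`, the `pip install` step and the `spokewrench --help` smoke test are all assumptions made for illustration.

```yaml
# .github/workflows/install-test.yml  (hypothetical file name)
name: Installation test

on:
  pull_request:
    branches: [develop]

jobs:
  install:
    runs-on: ubuntu-latest
    defaults:
      run:
        # Login shell so the activated conda env is available in every step
        shell: bash -el {0}
    steps:
      - uses: actions/checkout@v4

      # Build the conda env from the repo's environment.yml
      - uses: conda-incubator/setup-miniconda@v3
        with:
          environment-file: environment.yml
          # Assumed env name; it must match the `name:` field in environment.yml
          activate-environment: spokewrench

      # Assumes the repo root is pip-installable
      - name: Install spokewrench
        run: pip install .

      # Assumes the package exposes a `spokewrench` command with a --help flag
      - name: Smoke test
        run: spokewrench --help
```

This only proves that the package and its dependencies resolve and install; it says nothing about whether rotate produces correct output.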

LeeBergstrand commented 3 months ago

@jmtsuji I would change this so it tests multiple Python versions rather than only 3.9. That way we can test compatibility with multiple versions. See my comment for details.

jmtsuji commented 3 months ago

@LeeBergstrand Thanks for the feedback. I agree it would be nice to test multiple Python versions. The key issue is that we already lock down the Python version in environment.yml, which is used to build the conda env during install. I suspect the conda env file would override the Python version value(s) we set in the workflow, although I am not 100% sure about this. (I assume the workflow file will set up the Python version used by the runner, but then the runner will run conda env create --file=environment.yml or something similar.) We might need to change environment.yml to allow for a range of Python versions in order to test multiple versions. What are your thoughts?
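To make the proposed environment.yml change concrete, here is a minimal sketch of a ranged Python requirement next to the pinned one. The env name, the channel and the version bounds are placeholders, not the contents of the real file.

```yaml
# environment.yml (illustrative only)
name: spokewrench        # assumed env name
channels:
  - conda-forge          # assumed channel
dependencies:
  # Pinned form, as described in the comment above:
  # - python=3.9
  # Ranged form that a CI matrix could vary within (bounds are placeholders):
  - python>=3.9,<3.13
```

Whether the workflow's Python setting or this file decides the interpreter depends on how the env is created, so it is worth having the workflow print `python --version` inside the activated env to confirm which version was actually tested.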

LeeBergstrand commented 3 months ago

@jmtsuji

We usually don't use environment.yaml to install a single Python application in production. Environment files are used as a stop-gap measure or for building a custom environment consisting of multiple pieces of software. Once the package is fully developed, Conda will use the data in meta.yaml instead to build and install the package. pip uses pyproject.toml, and Conda's equivalent is meta.yaml. Ultimately, we could delete environment.yaml once we get the spokewrench package on Anaconda.org.
Usually, I will use more specific package versions in environment.yaml than in meta.yaml. The meta.yaml tells Conda what versions are necessary to install the package alongside other Python software inside a single environment.
Generally, I don't lock to a specific Python version unless I am in the process of building a super complex pipeline like QIIME2 or Rotary. I generally provide a range of Python versions that work with the code with a low probability of errors. This increases compatibility with other software installed in the same Conda environment. For example, what if we want spokewrench to be installed alongside other software in the same Conda environment, but that piece of software requires Python 3.10 while spokewrench is locked to 3.9? The installation would break. Providing a range of compatible Python versions solves this problem, as there will likely be an overlap between the version ranges that each piece of software supports.
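As a rough illustration of the looser constraints that belong in meta.yaml, the requirements section of a conda recipe could look like the fragment below. The version bounds are placeholders; the real recipe would list spokewrench's actual dependencies and whatever build backend pyproject.toml names.

```yaml
# meta.yaml (fragment of a conda-build recipe, illustrative only)
requirements:
  host:
    - python >=3.9       # placeholder lower bound
    - pip
  run:
    # A range lets the solver pick a Python that also satisfies other
    # packages in the same env, e.g. one that requires Python 3.10
    - python >=3.9,<3.13 # placeholder bounds
```

With a range like this, a package requiring Python 3.10 and this recipe can be solved into one environment, because 3.10 falls inside both constraints.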
Generally, I follow this design pattern.

1. I lock all the package versions during initial development to versions that are guaranteed to work.
2. I write unit tests and end-to-end tests.
3. I unlock the package versions and let the CI test the software with multiple Python versions.
4. The CI (https://en.wikipedia.org/wiki/Continuous_integration) may fail at specific package and Python versions. I take note of these and write down how they fail.
5. I then either fix the errors or go back and lock the packages or Python versions in a specific range of versions.
6. I continually update the package versions as new versions come out, using the CI system for tests.
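The step where the CI tests multiple Python versions is usually done with a build matrix. A minimal sketch, assuming the package and its dependencies are pip-installable from the repo root and that a test suite runnable with pytest exists (neither is confirmed in this thread; the file name, env name and version list are placeholders):

```yaml
# .github/workflows/test-matrix.yml  (hypothetical file name)
name: Test across Python versions

on:
  pull_request:
    branches: [develop]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      # Keep running the other versions when one fails, so every
      # failing version can be noted rather than only the first
      fail-fast: false
      matrix:
        python-version: ["3.9", "3.10", "3.11"]   # placeholder list
    defaults:
      run:
        shell: bash -el {0}
    steps:
      - uses: actions/checkout@v4

      # No environment file here: the matrix value picks the Python version
      - uses: conda-incubator/setup-miniconda@v3
        with:
          python-version: ${{ matrix.python-version }}
          activate-environment: test              # arbitrary env name
          channels: conda-forge

      - name: Show the interpreter that is actually in use
        run: python --version

      - name: Install package and test runner
        run: pip install . pytest

      - name: Run tests
        run: pytest
```

Setting `fail-fast: false` matches the note-taking step: each Python version reports its own pass or fail, which shows where a version range needs to be tightened.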

LeeBergstrand commented 3 months ago

@jmtsuji There's a GitHub service that runs a bot to update your dependencies automatically if you have CI tests in place: https://docs.github.com/en/code-security/dependabot/dependabot-version-updates. I'm not saying we necessarily need to use this right now, but this is where things go when you move into full production.
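For reference, Dependabot version updates are enabled with a small config file in the repository. The sketch below covers GitHub Actions versions and pip-style manifests; the ecosystems and weekly schedule are illustrative choices, and it does not assume anything about how conda environment files are handled.

```yaml
# .github/dependabot.yml
version: 2
updates:
  # Keep the actions used in workflow files up to date
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"

  # Keep Python dependencies declared in pip-style manifests up to date
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
```

Each update arrives as a pull request, so the CI tests decide whether the new version is safe to merge.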

LeeBergstrand commented 3 months ago

@jmtsuji I recommend that we merge this in right now and then test with other versions of Python in the future, perhaps when you're done writing the unit tests. Can you create an issue for testing with multiple Python versions?

LeeBergstrand commented 3 months ago

Generally, I don't lock to a specific Python version unless I am in the process of building a super complex pipeline like QIIME2 or Rotary. I generally provide a range of Python versions that work with the code with a low probability of errors. This increases compatibility with other software installed in the same Conda environment. For example, what if we want spokewrench to be installed alongside other software in the same Conda environment, but that piece of software requires Python 3.10 while spokewrench is locked to 3.9? The installation would break. Providing a range of compatible Python versions solves this problem, as there will likely be an overlap between the version ranges that each piece of software supports.

This is one of the reasons software rots over time and needs continual maintenance. It's not that the underlying software stops working. It's that the ecosystem of software around it grows to become incompatible. The software needs to be continually updated to stay relevant.

jmtsuji commented 3 months ago

Generally, I follow this design pattern...

@LeeBergstrand This is very helpful background, thanks. I wasn't aware of the difference between the meta.yaml used for conda installs and the environment.yaml we are using during development. Starting with fixed dependency versions in the environment.yaml and then making a more generic meta.yaml that can be tested with the CI is a smart idea. Overall, it sounds like I jumped the gun a bit by trying to get started on the CI at this stage. Probably better to start with unit tests and software development (with some basic local tests) before working with the CI, correct? For now, though, just to keep a record of the rough Actions workflow I made for future CI work, I'll go ahead and merge this into develop like you mentioned. I'll also make an issue related to testing more Python versions in the future. Really appreciate you laying out your typical development / software engineering workflow. Learning lots.