Open KSoumya opened 3 months ago
Also, put something like "fixes #3" in the PR description.
updated PR description
I just realized this is a new Snakefile in a subfolder. Why not add to our existing file?
I tried running and get this error:
Traceback (most recent call last):
  File "/home/balhoff/test-rule/char-sim/snakemake_conda/.snakemake/scripts/tmplbi3nw0_.sample_script.py", line 5, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'
I don't see torch in the env.yaml; should it be there?
I just realized this is a new Snakefile in a subfolder. Why not add to our existing file?
The subfolder is now removed, and the existing Snakefile is updated with a new rule.
@KSoumya thanks for the updates; I am trying it out.
@KSoumya when I run I get this error:
Traceback (most recent call last):
  File "/home/balhoff/test-rule/char-sim/.snakemake/scripts/tmp1a2ojqu_.create_train_data.py", line 8, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
I see that pandas is in environment.yaml, but under 'pip' rather than directly in 'dependencies'. What is the difference?
The difference is only in how it gets installed (via pip from PyPI, or via conda from a conda channel). The error suggests that you either didn't create the conda environment or that it isn't activated for the particular step that requires it.
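One way to tell those two cases apart is to ask the interpreter that actually runs the step which environment it belongs to and which modules it can see. The following is only a minimal sketch: the helper names `missing_modules` and `describe_environment` are made up for illustration, and it assumes that `CONDA_DEFAULT_ENV` is set by `conda activate`, which is the usual behavior but is not checked against this project's setup.

```python
import importlib.util
import os
import sys


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    # find_spec() looks a module up without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


def describe_environment():
    """Collect a few facts about which Python environment is active."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Usually unset when no conda environment has been activated.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    print(describe_environment())
    # pandas and torch are the two modules reported missing in this thread.
    print("missing:", missing_modules(["pandas", "torch"]))
```

If `prefix` points at the system Python rather than at a conda environment directory, the step is not running inside the environment, regardless of whether a package is listed under `pip` or directly under `dependencies`.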
@KSoumya @hlapp now that I actually have conda installed, this is working for me (I installed miniforge and mamba). But I needed to edit environment.yaml. I initially got some conflicts, which seemed to be between the version of snakemake I have (presumably one of the newest) and a very old version of python (3.8.19) that is specified in environment.yaml.
My intuition (without being familiar with snakemake/conda practices) would be that the environment should be specified in the most minimal way possible. But I'm not sure if this file is supposed to act as a statement of the direct dependencies or instead like a lock file. But as written it didn't work for me; maybe it would have if I had a specific version of conda or snakemake?
@KSoumya based on the shell snippet you sent me, I think your background environment may already have more installed in it, so the environment is not being set up by the rule:
snakemake --cores 4 --use-singularity id12_desc12_simGIC.tsv.gz
I needed to use --use-conda so that the environment was created when the rule was run:
snakemake -c4 --show-failed-logs --use-singularity --use-conda id12_desc12_simGIC.tsv.gz
Maybe this is why you didn't run into these issues in your own runs.
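To keep the flag from being forgotten between runs, the invocation could be assembled by a small wrapper. This is just a sketch: `build_snakemake_command` is a hypothetical helper, not part of snakemake or of this repository, and it uses only the flags and the target that appear in the two commands above.

```python
import shlex


def build_snakemake_command(target, cores=4, use_conda=True,
                            use_singularity=True, show_failed_logs=True):
    """Build the snakemake argument list for one target (hypothetical helper)."""
    argv = ["snakemake", "--cores", str(cores)]
    if show_failed_logs:
        argv.append("--show-failed-logs")
    if use_singularity:
        argv.append("--use-singularity")
    if use_conda:
        # Per the discussion above, this is what makes the rule build its own env.
        argv.append("--use-conda")
    argv.append(target)
    return argv


if __name__ == "__main__":
    command = build_snakemake_command("id12_desc12_simGIC.tsv.gz")
    # Print a copy-pasteable shell line; pass `command` to subprocess.run to execute it.
    print(shlex.join(command))
```

Defaulting `use_conda` to true means a run that relies on the background environment has to opt out explicitly.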
@balhoff your snakemake command makes complete sense; indeed, --use-conda needs to be enabled. I will check how to make environment.yaml more generic.
@KSoumya I also forgot to say—in the snakemake docs it says that without that flag, the conda environment property in a rule is entirely ignored.
That's right. Since I have the env defined and activated during my runs, I didn't come across this requirement. Thanks for sharing this.
My intuition (without being familiar with snakemake/conda practices) would be that the environment should be specified in the most minimal way possible. But I'm not sure if this file is supposed to act as a statement of the direct dependencies or instead like a lock file.
It can work as either. In the form exported using conda env export, all versions are "locked". This is often desirable: installing a later version of some dependency at a later time not only results in a different environment, but can (and in practice often does) break code that isn't forward compatible.
I do agree that Python 3.8.x is relatively old at this point, and that Python shouldn't need to be held at this version. That is, unless, I think, we're using TensorFlow 1. But I thought we're using Torch, and 3.11 is generally supported by recent versions of TensorFlow, Torch, etc.
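To illustrate the two readings of the file, here is a rough sketch of turning a fully locked dependency list into a looser one. It assumes the `name=version=build` form that `conda env export` writes for conda packages and `name==version` for pip packages. The function names are hypothetical, and apart from Python 3.8.19 the versions and build strings in the sample data are made up. conda can produce similar output itself with `conda env export --no-builds` or `conda env export --from-history`.

```python
def loosen_conda_spec(spec, keep_version=False):
    """Reduce 'name=version=build' to 'name', or to 'name=version'."""
    name, _, rest = spec.partition("=")
    version = rest.split("=")[0]
    if keep_version and version:
        return f"{name}={version}"
    return name


def loosen_pip_spec(spec, keep_version=False):
    """Reduce 'name==version' to 'name', or leave the version pin in place."""
    name, sep, version = spec.partition("==")
    if keep_version and sep:
        return f"{name}=={version}"
    return name


def loosen_dependencies(dependencies, keep_version=False):
    """Apply the loosening to a 'dependencies' list, including a nested pip section."""
    loosened = []
    for entry in dependencies:
        if isinstance(entry, dict):
            # The pip section is a mapping: {"pip": ["pkg==1.0", ...]}
            pip_specs = entry.get("pip", [])
            loosened.append({"pip": [loosen_pip_spec(s, keep_version) for s in pip_specs]})
        else:
            loosened.append(loosen_conda_spec(entry, keep_version))
    return loosened


if __name__ == "__main__":
    # Illustrative data only; build strings and pip versions are invented.
    locked = [
        "python=3.8.19=h0000000_0",
        {"pip": ["pandas==2.0.3", "torch==2.2.0"]},
    ]
    print(loosen_dependencies(locked))
    print(loosen_dependencies(locked, keep_version=True))
```

Dropping only the build strings (`keep_version=True`) keeps most of the reproducibility of a lock file while avoiding platform-specific builds. Dropping versions as well gives the minimal statement of direct dependencies, at the cost of the forward-compatibility risk described above.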
Fixes #3.