Closed by dmacguigan 4 weeks ago
Unfortunately, I haven't seen that before (the previous issues were when we didn't have networkx listed as a dependency in the recipe). The issue is in the interaction between slurm and your conda environment, so it is outside of verkko itself. I would guess the conda environment is somehow not active on the compute node trying to run the verkko command, and networkx is therefore missing. Perhaps your cluster admins could advise?
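One way to test that guess is to run a small check with the same interpreter that the compute node picks up. The sketch below is only an illustration: the helper names `module_available` and `report` are hypothetical and not part of verkko. It prints which python is running and whether networkx is importable from it.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be imported by the running interpreter."""
    return importlib.util.find_spec(name) is not None


def report(modules):
    """List the interpreter path and the import status of each module."""
    lines = ["interpreter: " + str(sys.executable)]
    for name in modules:
        status = "found" if module_available(name) else "MISSING"
        lines.append(name + ": " + status)
    return lines


if __name__ == "__main__":
    # networkx is the module reported missing in this thread.
    print("\n".join(report(["networkx"])))
```

If the interpreter path printed from a submitted job differs from the one printed inside the activated verkko environment on the login node, the environment is probably not being carried over to the compute node.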
Thanks for the quick reply @skoren! I believe the problem is exactly what you described. I fixed it by including the full path to the python version included in my verkko conda environment. Might be worth mentioning this in the help documentation for cluster users.
verkko -d ${WD} \
--hifi ${HERRO_READS} \
--nano ${READS} \
--hic1 ${HIC1} \
--hic2 ${HIC2} \
--slurm \
--snakeopts '--use-conda --cluster "./slurm-sge-submit.sh {threads} {resources.mem_gb} {resources.time_h} {rulename} {resources.job_id} --partition=general-compute --account=tkrabben --qos=general-compute"' \
--python '/projects/academic/tkrabben/modules_KrabLab/easybuild/2023.01/software/Core/miniconda3/22.11.1-1/envs/verkko/bin/python' \
--perl '/projects/academic/tkrabben/modules_KrabLab/easybuild/2023.01/software/Core/miniconda3/22.11.1-1/envs/verkko/bin/perl' \
--spl-run 1 8 24 # default runtime for this step is 96 hours, need to shorten it for UB cluster
However, I have now encountered another error, which seems to be a bug in the fix_haplogaps.py script.
Error executing rule processGraph on cluster (jobid: 6, external: 16135940, jobscript: /vscratch/grp-tkrabben/MacGuigan/genome_assemblies/Sander_vitreus_TJK-76/HERRO/verkko/.snakemake/tmp.b0p19ecu/verkko.processGraph.6.sh). For error details see the cluster log and the log files of the involved rule(s).
Exiting because a job execution failed. Look above for error message
(verkko) dmacguig@vortex-future:~/vscratch/MacGuigan/genome_assemblies/Sander_vitreus_TJK-76/HERRO/verkko/2-processGraph$ tail process.err
mend <446 >28439
mend <4587 <18523
mend <5313 >10338
mend >20820 >23078
mend >14466 >23294
mend >20567 >20568
Traceback (most recent call last):
File "/projects/academic/tkrabben/modules_KrabLab/easybuild/2023.01/software/Core/miniconda3/22.11.1-1/envs/verkko/lib/verkko/scripts/fix_haplogaps.py", line 122, in <module>
sys.stderr.write("can't fix " + key[0] + " " + key[1] + " due to overlap containing node (wanted " + str(wanted_gap_length) + ", node lengths " + len(node_seqs[key[0][1:]]) + ", " + len(node_seqs[key[1][1:]]) + ")")
TypeError: can only concatenate str (not "int") to str
After commenting out line 122 in fix_haplogaps.py, the pipeline proceeds as expected.
Thanks, I was going to add this to the documentation: "If you're using conda, you may need to make the conda-installed python your default before running." Do you think that is clear?
Thanks for the catch on the fix_haplogaps.py script, that does look like a bug. I think if those ints (the calls to len) were wrapped in str() it should also work but either way is OK since skipping the print won't affect functionality.
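A minimal sketch of the str() wrapping described above. The variable names follow line 122 in the traceback, but the helper `format_gap_warning` and the sample values are hypothetical and only there to make the snippet runnable on its own.

```python
def format_gap_warning(key, wanted_gap_length, node_seqs):
    """Build the warning text without concatenating int to str."""
    # key holds two oriented node names such as ">20567"; the leading
    # orientation character is dropped to look up the node sequence.
    len_a = len(node_seqs[key[0][1:]])
    len_b = len(node_seqs[key[1][1:]])
    return ("can't fix " + key[0] + " " + key[1]
            + " due to overlap containing node (wanted " + str(wanted_gap_length)
            + ", node lengths " + str(len_a) + ", " + str(len_b) + ")")


# Hypothetical sample data, not taken from the real assembly graph.
node_seqs = {"20567": "ACGT", "20568": "ACGTAC"}
key = (">20567", ">20568")
message = format_gap_warning(key, -10, node_seqs)
print(message)
```

The original line fails because len() returns an int and Python does not implicitly convert it when concatenating with `+`. Wrapping each len() call in str() keeps the diagnostic message instead of dropping it.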
No, thank you for this exciting pipeline!
That clarification looks good, but maybe this would be a bit more specific? "If you're using conda (especially in a cluster environment), you may need to make the conda-installed python your default. You can do this with the --python option when calling verkko."
OK updated, I left out the cluster part since I put it in the section on running on a grid. I suspect on a single node the conda environment would already be loaded.
Hello,
I am running verkko v.2.1 from within a conda environment that I built using
conda create -n verkko -c conda-forge -c bioconda -c defaults verkko
I'm running verkko on a computing cluster with the SLURM job scheduler. Unfortunately, I encountered the following error.
This error has been mentioned a few times on the verkko Issues page, and it does seem that networkx is installed in my verkko conda environment. But somehow, the processGraph step can't see the module.
Any suggestions on how to proceed?
Thank you, Dan