mrvollger / SDA

Segmental Duplication Assembler (SDA).
MIT License

Default Perl version is used instead of the custom cluster Perl module #18

Closed. SergejN closed this issue 1 year ago

SergejN commented 3 years ago

Oh shoot, I have another one that I completely forgot about. On our cluster, perl is installed as a module. There is also a local version of perl that comes with the OS (slightly outdated and without any additional libraries). When I run SDA, it seems to always fall back to the default perl version, and I haven't figured out why. I had to use a 'dirty' hack to make it work: I added ml perl/5.28.1-gcccore-8.2.0 to envs/env_python3.sh. I checked RepeatMasker itself, and there perl is called correctly as #!/usr/bin/env perl in both RepeatMasker and Taxonomy.pm. Right now I don't have a clue what causes this strange behavior, but I thought it was worth reporting. It could be the cluster or our anaconda installation.

Cheers, Sergej

mrvollger commented 3 years ago

Unlike the other dependencies, I was not able to make a good conda env for RepeatMasker. Therefore RepeatMasker and its dependencies must be loaded in env_sda.sh; in fact, you can see in the readme that this is where I module load perl.

SergejN commented 3 years ago

I think I figured it out. I do load the perl module in env_sda.sh.

#!/bin/bash

unset PYTHONPATH
ml anaconda3/2019.10
ml perl/5.28.1-gcccore-8.2.0
ml gcc/7.3.0-2.30
ml python/3.6.6-foss-2018b
ml samtools/1.10-foss-2018b

export PATH=$PATH:/software/RepeatMasker/latest/:/software/RMBlast/rmblast-2.10.0/bin

However, when I manually run the commands from the line of denovo_SDA.smk linked below, I am able to reproduce the behavior.

https://github.com/mrvollger/SDA/blob/1fbe948f3d8cde6ae6b8c49b33f4220053755718/denovo_SDA.smk#L15

After login

$ perl -v
This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi
(with 40 registered patches, see perl -V for more detail)

$ which perl
/usr/bin/perl

Then I source env_sda.sh, which loads the correct perl module:

$ source ../env_sda.sh

The following have been reloaded with a version change:
  1) gcccore/8.2.0 => gcccore/7.3.0     2) zlib/.1.2.11-gcccore-8.2.0 => zlib/.1.2.11-gcccore-7.3.0

$ perl -v

This is perl 5, version 28, subversion 1 (v5.28.1) built for x86_64-linux-thread-multi

However, after I run source envs/env_python3.sh, a different perl is picked up:

$ source envs/env_python3.sh

The following have been reloaded with a version change:
  1) gcccore/7.3.0 => gcccore/8.2.0     2) zlib/.1.2.11-gcccore-7.3.0 => zlib/.1.2.11-gcccore-8.2.0

The following have been reloaded with a version change:
  1) gcccore/8.2.0 => gcccore/7.3.0     2) zlib/.1.2.11-gcccore-8.2.0 => zlib/.1.2.11-gcccore-7.3.0

(sda-python-3) $ perl -v

This is perl 5, version 26, subversion 2 (v5.26.2) built for x86_64-linux-thread-multi

Copyright 1987-2018, Larry Wall

(sda-python-3) $ which perl
/projects/current/tests/sda/SDA/envs/sda-python-3/bin/perl
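The transcript above shows perl resolving to a different executable after each script is sourced. To see every perl that is reachable, in the order the shell would try them, a small helper can walk PATH directly. This is a minimal sketch, not part of SDA; list_on_path is a hypothetical name chosen for illustration.

```shell
#!/bin/bash
# list_on_path NAME [PATH_LIST]
# Print every executable called NAME found in PATH_LIST (default: $PATH),
# in lookup order. The first line printed is the one the shell would run.
# Simplification: empty PATH entries (which mean "current directory") are skipped.
list_on_path() {
    local name=$1 path_list=${2:-$PATH} dir
    local IFS=:
    for dir in $path_list; do
        [ -n "$dir" ] || continue
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
}

# Example: show all perl executables visible in the current shell.
list_on_path perl
```

Running this once after each source step makes it visible which directory was pushed to the front of PATH.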

As far as I can tell, the perl on the PATH is overridden by the perl bundled in the local conda environment (envs/sda-python-3), which takes precedence over the perl module. Hence, this is an issue caused by our local environment and software modules rather than by SDA itself. Still, it might be of interest to others who encounter similar issues. RepeatMasker actually fails because that perl cannot find the module Text::Soundex. If that module can be installed in the environment, then everything should work fine.
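Whether a given perl can load Text::Soundex can be checked directly with perl's -M switch, which loads a module before running the program. The sketch below is only a diagnostic and assumes nothing about how the module would be installed; perl_has_module is a hypothetical helper name.

```shell
#!/bin/bash
# perl_has_module PERL_BIN MODULE
# Return 0 if PERL_BIN can load MODULE, non-zero otherwise.
perl_has_module() {
    local perl_bin=$1 module=$2
    # -M<module> loads the module; -e 1 is a trivial program that does nothing.
    "$perl_bin" -M"$module" -e 1 >/dev/null 2>&1
}

# Report on whichever perl is currently first on PATH.
perl_bin=$(command -v perl || true)
if [ -z "$perl_bin" ]; then
    echo "no perl on PATH"
elif perl_has_module "$perl_bin" Text::Soundex; then
    echo "$perl_bin: Text::Soundex OK"
else
    echo "$perl_bin: Text::Soundex missing"
fi
```

Running the check inside the activated sda-python-3 environment, and again with only env_sda.sh sourced, would show which of the two perls is missing the module.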

mrvollger commented 3 years ago

Got it, and I think this is good to leave up. I might have an idea to fix it as well, but that will have to come later.
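In the meantime, one possible stopgap (not the maintainer's planned fix, which is not described in this thread) is to move the module perl's directory back to the front of PATH after the conda environment has been activated. The following is a minimal sketch under that assumption; pin_first_on_path is a hypothetical helper, and the usage shown in the comments assumes the two env scripts are sourced in the same order as in the transcript above.

```shell
#!/bin/bash
# pin_first_on_path DIR
# Move DIR to the front of PATH, removing any later copies of it.
# Simplification: empty PATH entries are dropped.
pin_first_on_path() {
    local dir=$1 entry new_path=
    local IFS=:
    for entry in $PATH; do
        # Skip empty entries and existing copies of DIR.
        if [ -z "$entry" ] || [ "$entry" = "$dir" ]; then
            continue
        fi
        new_path=${new_path:+$new_path:}$entry
    done
    PATH=$dir${new_path:+:$new_path}
    export PATH
}

# Intended usage (assumed order, shown as comments because the scripts are site-specific):
#   source ../env_sda.sh                      # module perl is now first on PATH
#   module_perl_dir=$(dirname "$(command -v perl)")
#   source envs/env_python3.sh                # conda env perl now shadows it
#   pin_first_on_path "$module_perl_dir"      # module perl is first again
```

Note that this puts everything in the module perl's directory ahead of the conda environment, not just perl, so it is worth checking that nothing else in that directory shadows a tool the environment is expected to provide.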