dfguan / purge_dups

haplotypic duplication identification tool
MIT License

Module runner error #83

Open OZTaekOppa opened 3 years ago

OZTaekOppa commented 3 years ago

Hello,

I am trying to test purge_dups on my plant genomes. After installing the program via Bioconda, I got through Step 1. Since Step 2 is optional, I changed nothing there except setting "lineage": "embryophyta". In Step 3, however, I ran into an issue.

For reference, this is the script I executed on an HPC PBSPro system:

conda activate purge_dups

Step 3: Use run_purge_dups.py to run the pipeline

/work/miniconda3/envs/purge_dups/scripts/run_purge_dups.py config.JulGM1K.PB.asm1.json src JulGM1K

conda deactivate

The error message:

Traceback (most recent call last):
  File "/work/miniconda3/envs/purge_dups/scripts/run_purge_dups.py", line 3, in <module>
    from runner.manager import manager
ImportError: No module named 'runner'

I have looked at the previous issues (run without cluster, just local machine), but they were not much help.

Did I miss something? Any idea or suggestion?

Cheers,

Taek

dfguan commented 3 years ago

Hello Taek, the run_purge_dups.py script relies on my HPC runner for job submission. Please run the following commands to install it:

git clone https://github.com/dfguan/runner.git
cd runner && python3 setup.py install --user
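If the ImportError persists after installing, one possible cause (an assumption, not something confirmed in this thread) is that `python3 setup.py install --user` ran under a different interpreter than the one executing run_purge_dups.py, for example the system Python versus the conda environment's Python. A minimal sketch for checking this, run with the same interpreter that launches the pipeline; `describe_module` is a hypothetical helper name:

```python
import importlib.util
import site
import sys


def describe_module(name):
    """Return where top-level module `name` would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Namespace packages may have no origin; fall back to their search paths.
    return spec.origin or str(list(spec.submodule_search_locations or []))


if __name__ == "__main__":
    # Which interpreter is running, and where does "--user" install to for it?
    print("interpreter:", sys.executable)
    print("user site-packages:", site.getusersitepackages())
    print("user site enabled:", site.ENABLE_USER_SITE)
    # None here means this interpreter cannot see the runner package.
    print("runner found at:", describe_module("runner"))
```

If `runner found at:` prints `None`, installing runner with the interpreter shown on the first line is probably the next thing to try.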

Best,

Dengfeng.

OZTaekOppa commented 3 years ago

Hi Dengfeng,

Thank you for your reply. I tried again, but no luck.

I strongly suspect the HPC environment is the cause. According to the purge_dups manual (and runner/sys.config), the pipeline was tested on LSF and SLURM but not on PBSPro. Unfortunately, I am using a PBSPro system.
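A quick way to see which scheduler submit commands are actually visible on a node is sketched below. The mapping of scheduler names to commands is an assumption for illustration (`qsub`, for instance, is also used by schedulers other than PBS Pro), and `available_schedulers` is a hypothetical helper, not part of runner:

```python
import shutil

# Submit commands conventionally associated with each scheduler (assumed mapping).
SUBMIT_COMMANDS = {"LSF": "bsub", "SLURM": "sbatch", "PBS": "qsub"}


def available_schedulers(which=shutil.which):
    """Map scheduler name -> path of its submit command, for commands found on PATH."""
    found = {}
    for name, cmd in SUBMIT_COMMANDS.items():
        path = which(cmd)
        if path:
            found[name] = path
    return found


if __name__ == "__main__":
    # An empty dict means none of the listed submit commands is on PATH.
    print(available_schedulers())
```

On a PBS-only cluster this would be expected to list only the `qsub` entry, which would mean that any tool trying to call `bsub` or `sbatch` cannot submit jobs there.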

Despite this, I have followed your suggestion.

  1. Installed runner

git clone https://github.com/dfguan/runner.git

cd runner && python3 setup.py install --user

(base) @.***:/work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SLPurDups/runner> ll
total 44
drwxrws--- 4 jungh5 MGRF_NGS 43 May 22 16:40 build
-rw-rw----+ 1 jungh5 MGRF_NGS 816 May 22 20:19 config.iSLPBO1.PB.asm1.json
drwxrws--- 2 jungh5 MGRF_NGS 36 May 22 16:40 dist
-rw-rw---- 1 jungh5 MGRF_NGS 1073 May 22 16:39 example.py
drwxrws--- 2 jungh5 MGRF_NGS 48 May 22 18:22 iSLPBO1.pri
-rw-rw---- 1 jungh5 MGRF_NGS 1070 May 22 16:39 LICENSE
-rwx------+ 1 jungh5 MGRF_NGS 1112 May 22 20:20 PurDups_SPBONT1st_CN1K_PBS.sh
-rw-rw----+ 1 jungh5 MGRF_NGS 967 May 23 20:42 README.md
drwxrws--- 4 jungh5 MGRF_NGS 4096 May 24 09:14 runner
drwxrws--- 2 jungh5 MGRF_NGS 110 May 22 16:40 runner.egg-info
-rw-rw---- 1 jungh5 MGRF_NGS 386 May 22 16:39 setup.py
-rw------- 1 jungh5 default 1457 May 22 16:56 SL_PurDups.o9498207
-rw------- 1 jungh5 default 1301 May 22 18:07 SL_PurDups.o9499284
-rw------- 1 jungh5 default 86 May 22 18:27 SL_PurDups.o9499513
-rw------- 1 jungh5 default 739 May 23 18:14 SL_PurDups.o9500416
drwxrws--- 4 jungh5 MGRF_NGS 39 May 23 06:48 SPO_CN1KCt_RN

  2. Executed script

#!/bin/bash -l
#PBS -N SL_PurDups
#PBS -l walltime=150:00:00
#PBS -j oe
#PBS -l select=1:ncpus=12:mem=600GB

cd $PBS_O_WORKDIR

module load python/3.8.6-gcccore-10.2.0

Step 1: Use pd_config.py to generate a configuration file (/work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SLPurDups/purge_dups/scripts/pd_config.py -l iSLPBO1.pri -n config.iSLPBO1.PB.asm1.json /work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SPO_CN1KCt_RN.fasta /home/jungh5/KRIBB_QUT/SPMA_DOBR/SLPBONT_FA/SLPBONT1st.fasta

Step 3: Use run_purge_dups.py to run the pipeline

  3. Generated folder(s): after Step 1 (use pd_config.py to generate a configuration file), the "SPO_CN1KCt_RN" folder was created. However, its two sub-folders were empty.

Outcome:

(base) @.***:/work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SLPurDups/runner/SPO_CN1KCt_RN> ll
total 0
drwxrws--- 2 jungh5 MGRF_NGS 6 May 23 06:48 coverage
drwxrws--- 2 jungh5 MGRF_NGS 6 May 23 06:48 split_aln

calculate coverage and self-alignment
File not found
CMD: ['bsub', '-K', '-q', 'normal', '-M', '5000', '-n', '1', '-R"select[mem>5000] rusage[mem=5000] span[hosts=1]"', '-J', 'split_iSLPBO1', '-o', 'SPO_CN1KCt_RN/split_aln/splitiSLPBO1%J.o', '-e', 'SPO_CN1KCt_RN/split_aln/splitiSLPBO1%J.e', 'src/split_fa /work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SLPurDups/runner/iSLPBO1.pri/SPO_CN1KCt_RN.fasta > SPO_CN1KCt_RN/split_aln/SPO_CN1KCt_RN.split.fa']
command src/split_fa /work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SLPurDups/runner/iSLPBO1.pri/SPO_CN1KCt_RN.fasta > SPO_CN1KCt_RN/split_aln/SPO_CN1KCt_RN.split.fa failed, return code: 1
purge duplicates
PBS Job 9500416.pbs
CPU time : 11:15:29
Wall time : 11:26:29
Mem usage : 629145600kb
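The command in this log refers to `src/split_fa` as a path relative to the working directory, so it may be worth confirming that such a file exists and is executable from where the job runs. A minimal sketch, assuming only the names `src` and `split_fa` from the log; `resolve_executable` is a hypothetical helper:

```python
import os


def resolve_executable(bin_dir, name):
    """Return the absolute path of bin_dir/name if it is an executable file, else None."""
    path = os.path.abspath(os.path.join(bin_dir, name))
    if os.path.isfile(path) and os.access(path, os.X_OK):
        return path
    return None


if __name__ == "__main__":
    # "src" and "split_fa" are taken from the log; adjust bin_dir to wherever
    # the purge_dups binaries actually live on your system.
    print("split_fa:", resolve_executable("src", "split_fa"))
```

A result of `None` would suggest that the binary directory passed to run_purge_dups.py does not point at the compiled tools. This check is independent of the scheduler question.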

  4. Another try, this time from the runner/runner sub-directory:

(base) @.***:/work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SLPurDups/runner/runner> ll
total 36
-rw-r----- 1 jungh5 MGRF_NGS 822 May 23 21:52 config.iSLPBO1.PB.asm1.json
-rw-rw---- 1 jungh5 MGRF_NGS 8012 May 22 16:39 hpc.py
-rw-rw---- 1 jungh5 MGRF_NGS 0 May 22 16:39 __init__.py
drwxr-s--- 2 jungh5 MGRF_NGS 48 May 23 21:47 iSLPBO1.pri
-rw-rw---- 1 jungh5 MGRF_NGS 6862 May 22 16:39 manager.py
-rwx------+ 1 jungh5 MGRF_NGS 1112 May 23 22:13 PurDups_SPBONT1st_CN1K_PBS.sh
-rw------- 1 jungh5 default 86 May 23 21:52 SL_PurDups.o9501786
-rw------- 1 jungh5 default 753 May 24 00:58 SL_PurDups.o9501977
drwxrws--- 4 jungh5 MGRF_NGS 39 May 23 22:13 SPO_CN1KCt_RN

However, it is still the same error.

calculate coverage and self-alignment
File not found
CMD: ['bsub', '-K', '-q', 'normal', '-M', '5000', '-n', '1', '-R"select[mem>5000] rusage[mem=5000] span[hosts=1]"', '-J', 'split_iSLPBO1', '-o', 'SPO_CN1KCt_RN/split_aln/splitiSLPBO1%J.o', '-e', 'SPO_CN1KCt_RN/split_aln/splitiSLPBO1%J.e', 'src/split_fa /work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SLPurDups/runner/runner/iSLPBO1.pri/SPO_CN1KCt_RN.fasta > SPO_CN1KCt_RN/split_aln/SPO_CN1KCt_RN.split.fa']
command src/split_fa /work/MGRF_NGS/SPMA_DOB_ASM/CN1K/PBONT1st/SLPurDups/runner/runner/iSLPBO1.pri/SPO_CN1KCt_RN.fasta > SPO_CN1KCt_RN/split_aln/SPO_CN1KCt_RN.split.fa failed, return code: 1
purge duplicates
PBS Job 9501977.pbs
CPU time : 02:43:31
Wall time : 02:44:20
Mem usage : 629145600kb
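Since the whole PBS job already runs on a single node, one way to isolate whether the failure lies in job submission or in the tool itself is to run the logged command directly, without `bsub`. The sketch below is not how run_purge_dups.py works; it only reproduces the `command > output` redirection seen in the log, and `run_to_file` is a hypothetical helper:

```python
import subprocess


def run_to_file(argv, out_path):
    """Run argv locally, writing its stdout to out_path; return the exit code."""
    with open(out_path, "wb") as out:
        return subprocess.run(argv, stdout=out).returncode


# Example call (paths are placeholders, not taken from a working setup):
# code = run_to_file(["/path/to/split_fa", "assembly.fasta"], "assembly.split.fa")
# print("return code:", code)
```

If the tool succeeds when called this way, the problem is more likely in the submission layer than in purge_dups itself.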

Do you have any idea or suggestion on this matter?

Looking forward to your reply!

Regards,

Taek
