Gabaldonlab / perSVade

perSVade: personalized Structural Variation detection
GNU General Public License v3.0

BLAST Database creation error: mdb_env_open: Function not implemented #4

Closed lianglunping closed 1 year ago

lianglunping commented 2 years ago

Describe the issue

Sometimes, running test_installation_modules.py generates the following errors:

[23/08/2022, 02:11:15] perSVade annotate_SVs finished correctly
Running module annotate_SVs...
BLAST Database creation error: mdb_env_open: Function not implemented
[23/08/2022, 02:13:06] Running perSVade find_homologous_regions into ./installation/test_installation/testing_outputs/find_hom_regions with 1820.795 Gb of RAM and 288 cores
[23/08/2022, 02:13:16] Getting blastn of genome against itself
Traceback (most recent call last):
  File "/perSVade/installation/test_installation/../../scripts/find_homologous_regions", line 128, in <module>
    blastn_file = fun.get_blastn_regions_genome_against_itself(opt.ref, opt.max_evalue, opt.query_window_size, opt.replace, opt.threads, max_query_windows=opt.max_n_query_windows)
  File "/perSVade/installation/test_installation/../../scripts/sv_functions.py", line 10836, in get_blastn_regions_genome_against_itself
    blast_df = blastn_query_against_subject(query_multifasta, reference_genome, blast_outfile, tmpdir, threads, replace, max_eval=max_eval)
  File "/perSVade/installation/test_installation/../../scripts/sv_functions.py", line 10787, in blastn_query_against_subject
    run_cmd("%s -in %s -dbtype nucl > %s"%(makeblastdb, database_chrom, makeblastdb_out), env=EnvName_RepeatMasker)
  File "/perSVade/installation/test_installation/../../scripts/sv_functions.py", line 691, in run_cmd
    if out_stat!=0: raise ValueError("\n%s\n did not finish correctly. Out status: %i"%(cmd_to_run, out_stat))
ValueError: 
source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env_RepeatMasker_env > /dev/null 2>&1 && /opt/conda/envs/perSVade_env_RepeatMasker_env/bin/makeblastdb -in ./installation/test_installation/testing_outputs/find_hom_regions/reference_genome_dir/reference_genome.fasta.generating_blast_against_itself/Ca22chr1A_C_albicans_SC5314/database.fasta -dbtype nucl > ./installation/test_installation/testing_outputs/find_hom_regions/reference_genome_dir/reference_genome.fasta.generating_blast_against_itself/Ca22chr1A_C_albicans_SC5314/database.fasta.generating.std
 did not finish correctly. Out status: 256
Running module find_homologous_regions...

ERROR!!! Out status: 256
Traceback (most recent call last):
  File "./installation/test_installation/test_installation_modules.py", line 169, in <module>
    fun.run_cmd("%s find_homologous_regions --threads %i --fraction_available_mem 1.0 -o %s --ref %s --min_chromosome_len 1000"%(fun.perSVade_modules, threads, outdir_homRegions, Calbicans_chr1_2_6))
  File "./installation/test_installation/../../scripts/sv_functions.py", line 691, in run_cmd
    if out_stat!=0: raise ValueError("\n%s\n did not finish correctly. Out status: %i"%(cmd_to_run, out_stat))
ValueError: 
source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env > /dev/null 2>&1 && /perSVade/installation/test_installation/../../scripts/perSVade find_homologous_regions --threads 288 --fraction_available_mem 1.0 -o ./installation/test_installation/testing_outputs/find_hom_regions --ref ./installation/test_installation/testing_outputs/Candida_albicans_chr1_2_6.fasta --min_chromosome_len 1000
 did not finish correctly. Out status: 2560
ERROR conda.cli.main_run:execute(34): Subprocess for 'conda run ['python', '-u', './installation/test_installation/test_installation_modules.py']' command failed.  (See above for error)

Additional information

version

OS: CentOS 7; Docker image: mikischikora/persvade:v1.02.6

Full command line

docker run -v $PWD/perSVade_testing_outputs:/perSVade/installation/test_installation/testing_outputs mikischikora/persvade:<tag> python -u ./installation/test_installation/test_installation_modules.py

I would appreciate help solving this issue. Thank you.

MikiSchikora commented 2 years ago

Good afternoon,

For some reason makeblastdb is failing, maybe due to this mdb_env_open error. It would be useful for me to see the standard output of this command (in ./installation/test_installation/testing_outputs/find_hom_regions/reference_genome_dir/reference_genome.fasta.generating_blast_against_itself/Ca22chr1A_C_albicans_SC5314/database.fasta.generating.std). Can you post the content of this file here?

Does this always happen, or only randomly in some runs?
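To tell a deterministic failure from a random one, the failing command can be rerun a few times while recording its exit codes and stderr. This is a minimal sketch: tally_exit_codes is a hypothetical helper, and the demo uses a stand-in Python command rather than makeblastdb.

```python
import subprocess
import sys
from collections import Counter


def tally_exit_codes(cmd, runs=5):
    """Run cmd (an argument list) several times.

    Returns a Counter of exit codes and the stderr text of the last run.
    """
    counts = Counter()
    last_stderr = ""
    for _ in range(runs):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        counts[proc.returncode] += 1
        last_stderr = proc.stderr
    return counts, last_stderr


if __name__ == "__main__":
    # Stand-in command that always fails. To test the real case, replace it
    # with the makeblastdb call from the traceback, for example
    # ["makeblastdb", "-in", "database.fasta", "-dbtype", "nucl"].
    demo_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
    counts, stderr_text = tally_exit_codes(demo_cmd, runs=3)
    print(dict(counts))
    print(stderr_text)
```

If every run ends with the same non-zero code, the failure is probably tied to the environment rather than to a race or a transient resource limit.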

Thanks,

Miquel Àngel Schikora

lianglunping commented 2 years ago

Hi @MikiSchikora

Thank you for your help. Here are my answers:

1. The standard output of this command is in the attachment; I hope it gives you more information: database.fasta.generating.std.txt

2. The issue always happens when I run this command.

Thanks and regards,

Lunping Liang

lianglunping commented 2 years ago

I managed to avoid this issue by using the traditional installation. Unfortunately, I have run into a new problem, which appears in the output that starts at "Running module find_homologous_regions". Here is the error output:

Running module find_homologous_regions...
[25/08/2022, 14:45:43] Running perSVade find_homologous_regions into ./installation/test_installation/testing_outputs/find_hom_regions with 1725.853 Gb of RAM and 288 cores
[25/08/2022, 14:45:52] Getting blastn of genome against itself
[25/08/2022, 14:46:41] Getting homologous regions
[25/08/2022, 14:46:42] perSVade find_homologous_regions finished correctly
Running module infer_repeats...
[25/08/2022, 14:48:36] Running perSVade infer_repeats into ./installation/test_installation/testing_outputs/repeats_infer_Cglab with 1725.369 Gb of RAM and 288 cores
Traceback (most recent call last):
  File "/public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/../../scripts/infer_repeats", line 121, in <module>
    repeats_df, repeats_table_file = fun.get_repeat_maskerDF(opt.ref, threads=opt.threads, replace=opt.replace)
  File "/public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/../../scripts/sv_functions.py", line 16520, in get_repeat_maskerDF
    repeat_masker_outfile_personal, repeat_masker_outfile_default = run_repeat_masker(reference_genome, threads=threads, replace=replace, use_repeat_modeller=use_repeat_modeller)
  File "/public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/../../scripts/sv_functions.py", line 16467, in run_repeat_masker
    if use_repeat_modeller is True: library_repeats_repeatModeller, new_families_identified =  run_repeat_modeller(reference_genome, threads=threads, replace=replace)
  File "/public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/../../scripts/sv_functions.py", line 16411, in run_repeat_modeller
    run_cmd(build_db_cmd, env=EnvName_RepeatMasker)
  File "/public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/../../scripts/sv_functions.py", line 691, in run_cmd
    if out_stat!=0: raise ValueError("\n%s\n did not finish correctly. Out status: %i"%(cmd_to_run, out_stat))
ValueError: 
source /public/home/lianglunping/miniconda3/etc/profile.d/conda.sh && conda activate perSVade_RepeatMasker_env > /dev/null 2>&1 && cd /public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/testing_outputs/repeats_infer_Cglab/reference_genome_dir/reference_genome.fasta.repeat_modeler_outdir && /public/home/lianglunping/miniconda3/envs/perSVade_RepeatMasker_env/bin/BuildDatabase -name reference_genome.fasta /public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/testing_outputs/repeats_infer_Cglab/reference_genome_dir/reference_genome.fasta.repeat_modeler_outdir/reference_genome.fasta > /public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/testing_outputs/repeats_infer_Cglab/reference_genome_dir/reference_genome.fasta.repeat_modeler_outdir/reference_genome.fasta.genearting_db.std 2>&1
 did not finish correctly. Out status: 32512

ERROR!!! Out status: 256
Traceback (most recent call last):
  File "./installation/test_installation/test_installation_modules.py", line 181, in <module>
    fun.run_cmd("%s infer_repeats --threads %i --fraction_available_mem 1.0 -o %s --ref %s --min_chromosome_len 100"%(fun.perSVade_modules, threads, outdir_repeats_fast, ref_genome))
  File "./installation/test_installation/../../scripts/sv_functions.py", line 691, in run_cmd
    if out_stat!=0: raise ValueError("\n%s\n did not finish correctly. Out status: %i"%(cmd_to_run, out_stat))
ValueError: 
source /public/home/lianglunping/miniconda3/etc/profile.d/conda.sh && conda activate perSVade > /dev/null 2>&1 && /public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/../../scripts/perSVade infer_repeats --threads 288 --fraction_available_mem 1.0 -o ./installation/test_installation/testing_outputs/repeats_infer_Cglab --ref ./installation/test_installation/testing_outputs/reduced_genome.fasta --min_chromosome_len 100
 did not finish correctly. Out status: 2560

Could you tell me how to deal with this issue?

Thanks and regards,

Lunping Liang

MikiSchikora commented 2 years ago

Hi,

Regarding the docker problem, I have seen this topic suggesting that it could be related to the filesystem not supporting file locking in the BLAST runs. I am not sure how to solve this, but this may also be helpful. Can you try it on a different filesystem (or try to fix the file locking) and see if the problem persists?

Regarding the traditional installation, it seems that the infer_repeats module is failing. The error happens at the database building step, so I suspect it may be similar to what happens with the docker image. Can you send me the content of ' /public/home/lianglunping/perSVade/perSVade-1.02.7/installation/test_installation/testing_outputs/repeats_infer_Cglab/reference_genome_dir/reference_genome.fasta.repeat_modeler_outdir/reference_genome.fasta.genearting_db.std ' (or attach the file here)? This may be useful to understand how to solve the problem.
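The "Out status" values in the tracebacks above (256, 2560, 32512) are all multiples of 256, which is what a raw Unix wait status looks like when a process exits normally. A minimal sketch for decoding them, assuming run_cmd reports the unshifted status (the thread does not show how out_stat is obtained, and decode_wait_status is a hypothetical helper):

```python
def decode_wait_status(status):
    """Split a raw Unix wait status into (exit_code, signal_number).

    Assumption: status has the classic 16-bit layout, with the exit code
    in the high byte and the terminating signal in the low 7 bits.
    """
    signal_number = status & 0x7F
    exit_code = (status >> 8) & 0xFF
    return exit_code, signal_number


for raw in (256, 2560, 32512):
    code, sig = decode_wait_status(raw)
    print(f"status {raw}: exit code {code}, signal {sig}")
# status 256: exit code 1, signal 0
# status 2560: exit code 10, signal 0
# status 32512: exit code 127, signal 0
```

Under that assumption, 32512 corresponds to exit code 127, which shells conventionally return when a command cannot be found.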

Have you tried the singularity image (the recommended installation option, which in our experience usually works best)?

Kind regards,

Miquel Àngel Schikora Tamarit

lianglunping commented 2 years ago

Hi, thanks for your help. Here are my responses:

1. I only have access to one server, so I don't seem to be able to try a different filesystem.

2. The content of reference_genome.fasta.genearting_db.std is as follows:

/public/home/lianglunping/.perSVade_tmp/AJBFQDUGXQBVTXS.sh:Line1: /public/home/lianglunping/miniconda3/envs/perSVade_RepeatMasker_env/bin/BuildDatabase: No such file

3. Yes, I have tried the singularity image. Unfortunately, I ran into the same kind of problem ("mdb_env_open: Function not implemented"), and I also needed to edit sv_functions.py because of some Python package updates.

Kind regards,

Lunping Liang

MikiSchikora commented 2 years ago

Hi,

Regarding the singularity and docker problems, I am afraid that this seems like an interaction of the perSVade image with your filesystem (maybe because of the locking that we discussed) and I don't know how I can help. Maybe some sysadmins from your institution can help.

Regarding the traditional installation, it seems that the BuildDatabase binary is not installed, for a reason I don't understand. Can you verify that you installed the expected versions of repeatmasker (4.0.9_p2) and repeatmodeler (2.0.1)? I know that conda installations may not be 100% reproducible, and I wonder if mamba is installing buggy dependencies when you run mamba env create --file installation/RepeatMasker_env.yml --name $PERSVADE_ENV_NAME'_RepeatMasker_env'. Can you verify that you ran all the steps of the traditional installation and that you have the correct dependencies? If so, can you check whether the BuildDatabase binary is somewhere else under the RepeatMasker_env folder? If it is, I may be able to work on a patch.
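A quick way to check whether BuildDatabase exists anywhere under the environment folder is to walk the directory tree. This is a minimal sketch: find_executable is a hypothetical helper, and the environment path is the one that appears in the traceback, so adjust it to your own installation.

```python
import os


def find_executable(root, name):
    """Walk root and return (path, is_executable) for every file called name."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            candidate = os.path.join(dirpath, name)
            matches.append((candidate, os.access(candidate, os.X_OK)))
    return matches


if __name__ == "__main__":
    # Path taken from the traceback in this thread; yours may differ.
    env_dir = "/public/home/lianglunping/miniconda3/envs/perSVade_RepeatMasker_env"
    hits = find_executable(env_dir, "BuildDatabase")
    if not hits:
        print("BuildDatabase not found under", env_dir)
    for path, executable in hits:
        print(path, "(executable)" if executable else "(not executable)")
```

An empty result would suggest that the RepeatModeler package did not install correctly. A hit outside bin/ would suggest that the binary exists but is not where perSVade looks for it.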

Kind regards,

Miquel Àngel Schikora Tamarit

MikiSchikora commented 2 years ago

Hi,

Were you able to solve this issue? If so, I will close it.

Kind regards,

Miquel Àngel Schikora Tamarit

lianglunping commented 2 years ago

Hi, Thanks for your help. This issue has been resolved.

Kind regards,

Lunping Liang