Closed: danicats closed this issue 2 years ago
Thanks for reporting, @danicats. My first impression is that this is an environment-related issue. Could you clarify the following details about your failed run so that we can provide more insightful comments?
python3 mcclintock.py --install
.mcclintock.py
on the test dataset.
Hi Shunhua,
Thanks so much for your help!
I just downloaded McClintock, so it should be the latest version. I already had conda installed, and I used it to install McClintock. I didn't see any errors during installation, but I could try reinstalling.
These are the commands that I ran:
to download test data: python3 test/download_test_data.py
to run mcclintock.py: python3 mcclintock.py -r test/sacCer2.fasta -c test/sac_cer_TE_seqs.fasta -g test/reference_TE_locations.gff -t test/sac_cer_te_families.tsv -1 test/SRR800842_1.fastq.gz -2 test/SRR800842_2.fastq.gz -p 4 -o test_data
I will try recreating the conda environment and see if that helps.
Thanks again,
Danica
Hi, thanks for the help! I dug deeper into the log files and found this error in one of them:
Building a new DB, current time: 11/15/2021 12:55:24
New DB name: /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_again/160309_MONK_0468_AC83LBACXX_L6_AAGAGGCA-AAGGAGTA_1/tmp/repeatmasker/RM_48825.MonNov151255052021/consensusTEs.fasta
New DB title: /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_again/sacCer2/consensus_fasta/consensusTEs.fasta
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
No volumes were created.
Error: mdb_env_open: Cannot allocate memory
After some searching, I found that this is a recurring problem with BLAST and the solution is to run:
export BLASTDB_LMDB_MAP_SIZE=100000000
In case anyone has the same problem in the future here is the thread for this solution: https://www.biostars.org/p/413294/
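For anyone who would rather set the variable only for the McClintock run instead of the whole shell session, here is a minimal Python sketch. It assumes the same value as the export command above; the helper names `env_with_lmdb_map_size` and `run_mcclintock` are hypothetical and not part of McClintock.

```python
import os
import subprocess

# Value taken from the export command in this thread; other systems may need a different size.
LMDB_MAP_SIZE = "100000000"


def env_with_lmdb_map_size(base_env=None, map_size=LMDB_MAP_SIZE):
    """Return a copy of the environment with BLASTDB_LMDB_MAP_SIZE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["BLASTDB_LMDB_MAP_SIZE"] = str(map_size)
    return env


def run_mcclintock(args, map_size=LMDB_MAP_SIZE):
    """Run mcclintock.py with the variable set for the child process only."""
    cmd = ["python3", "mcclintock.py", *args]
    return subprocess.run(cmd, env=env_with_lmdb_map_size(map_size=map_size), check=True)
```

Calling `run_mcclintock([...])` from the McClintock directory with the same arguments as the test command above should behave like running the export first, but the variable is not left set in your shell afterwards.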
It seems to be running without a hitch now! Thanks!
Great news on finishing the test run successfully, and thanks for providing a quick solution, @danicats! It looks like this issue could be specific to the computing resources (e.g., it may relate to how much virtual memory is available on the system). We will also look into it and see whether we can solve it within McClintock so that users don't have to apply this workaround.
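One way to check whether a virtual memory limit is in place on a given machine is sketched below. This is only a diagnostic under an assumption: the thread does not confirm that the LMDB error comes from a per-process address-space limit, and some clusters enforce memory limits by other means (for example through the job scheduler) that this check will not see. The `resource` module is part of the Python standard library on Unix-like systems only; the function names are hypothetical.

```python
import resource


def describe_limit(value):
    """Format an rlimit value, which is either a byte count or RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024 ** 3:.2f} GiB"


def address_space_limits():
    """Return the (soft, hard) virtual memory limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return describe_limit(soft), describe_limit(hard)


soft, hard = address_space_limits()
print(f"virtual memory limit: soft={soft}, hard={hard}")
```

If the soft limit is small on the node where the job runs, that would be consistent with the explanation above, though it would not prove it.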
Hi, thank you for maintaining this tool! I am attempting to run the test data and I am getting the following error:
Error in rule repeatmask:
jobid: 31
output: /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/SRR800842_1/intermediate/sacCer2.repeatmasker.out
conda-env: /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/install/envs/conda/868b58eb
RuleException:
CalledProcessError in line 306 of /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/snakemake/8262709/Snakefile:
Command 'source /home/danicats/miniconda3/envs/mcclintock/bin/activate '/oak/stanford/scg/lab_asbhatt/danicats/mcclintock/install/envs/conda/868b58eb'; set -euo pipefail; python /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/snakemake/8262709/.snakemake/scripts/tmpi8viig8f.repeatmask.py' returned non-zero exit status 1.
File "/home/danicats/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2340, in run_wrapper
File "/oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/snakemake/8262709/Snakefile", line 306, in rule_repeatmask
File "/home/danicats/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
File "/home/danicats/miniconda3/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/home/danicats/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
File "/home/danicats/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2352, in run_wrapper
And here is the log file:
samtools faidx /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/genome_fasta/sacCer2.fasta
[bwa_index] Pack FASTA... 0.14 sec
[bwa_index] Construct BWT for the packed sequence...
faToTwoBit /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/genome_fasta/sacCer2.fasta /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/SRR800842_1/intermediate/genome_fasta/sacCer2.aug.fasta.2bit
[bwa_index] 5.92 seconds elapse.
[bwa_index] Update BWT... 0.10 sec
[bwa_index] Pack forward-only FASTA... 0.13 sec
[bwa_index] Construct SA from BWT and Occ... 1.76 sec
[main] Version: 0.7.4-r385
[main] CMD: bwa index /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/genome_fasta/sacCer2.fasta
[main] Real time: 8.170 sec; CPU: 8.059 sec
bwa index /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/genome_fasta/sacCer2.fasta
RepeatMasker version open-4.0.7
Search Engine: NCBI/RMBLAST [ 2.10.0+ ]
Warning...unknown stuff <
Building general libraries in: /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/install/envs/conda/868b58eb/share/RepeatMasker/Libraries/dc20170127/general
RepeatMasker::createLib(): Error invoking /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/install/envs/conda/868b58eb/bin/makeblastdb on file /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/install/envs/conda/868b58eb/share/RepeatMasker/Libraries/dc20170127/general/at.lib.
RepeatMasker -pa 1 -lib /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/consensus_fasta/consensusTEs.fasta -dir /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/SRR800842_1//tmp/repeatmasker -s -nolow -no_is /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/genome_fasta/sacCer2.fasta
RepeatMasker -pa 1 -lib /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/consensus_fasta/consensusTEs.fasta -dir /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/SRR800842_1//tmp/repeatmasker -s -nolow -no_is /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/genome_fasta/sacCer2.fasta
bedtools maskfasta -fi /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/genome_fasta/sacCer2_unaugmented.fasta -fo /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/SRR800842_1//tmp/8262709tmpmaskedreference.fasta -bed /oak/stanford/scg/lab_asbhatt/danicats/mcclintock/test_data_2/sacCer2/reference_te_locations/unaugmented_inrefTEs.gff
It looks like it is having trouble "invoking makeblastdb", but when I check the location of makeblastdb, it is clearly there.
Any help would be appreciated! Thanks in advance!
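When a binary is present but a tool still reports an error invoking it, the cause is often in the program's own output rather than in its location. As a rough aid, the Python sketch below scans log text for the two error messages quoted in this thread and checks whether a path is an executable file. The signature patterns are copied from the log excerpts here; the function names and labels are hypothetical, and the list of patterns is not meant to be complete.

```python
import os
import re

# Error signatures copied from the log excerpts in this thread.
SIGNATURES = {
    "lmdb_map_size": re.compile(r"mdb_env_open: Cannot allocate memory"),
    "makeblastdb_failed": re.compile(r"Error invoking (\S*makeblastdb)"),
}


def scan_log_text(text):
    """Return a list of (label, line_number, line) for every matching log line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((label, number, line.strip()))
    return hits


def is_runnable(path):
    """True if path is an existing file that the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)
```

Reading each log file in the run directory and passing its contents to `scan_log_text` gives a quick list of the lines worth looking at first.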