bergmanlab / mcclintock

Meta-pipeline to identify transposable element insertions using next generation sequencing data

teflon error #76

Closed tomaszjacek closed 3 years ago

tomaszjacek commented 3 years ago

Hi,

When I run the TEFLoN analysis with the command

python3 mcclintock.py \
    -r /work/mcclintock/test/sacCer2.fasta \
    -c /work/mcclintock/test/sac_cer_TE_seqs.fasta \
    -g /work/mcclintock/test/reference_TE_locations.gff \
    -t /work/mcclintock/test/sac_cer_te_families.tsv \
    -1 /data/mcclintock/test/SRR800842_1.fastq.gz \
    -2 /data/mcclintock/test/SRR800842_2.fastq.gz \
    -p 10 \
    -m teflon \
    -o /data/mcclintock/test/output/

I get this error:

Job counts:
    count    jobs
    1    make_consensus_fasta
    1    make_reference_fasta
    1    make_te_annotations
    1    setup_reads
    1    summary_report
    1    teflon_post
    1    teflon_preprocessing
    1    teflon_run
    8
Environment defines Python version < 3.5. Using Python of the master process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.5 only.
Environment defines Python version < 3.5. Using Python of the master process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.5 only.
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20
[Thu Jan 14 18:10:01 2021]
Error in rule teflon_run:
    jobid: 2
    output: /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/genotypes/sample.genotypes.txt
    conda-env: /work/mcclintock/install/envs/conda/c707b3e8

RuleException:
CalledProcessError in line 49 of /work/mcclintock/snakefiles/teflon.snakefile:
Command 'source /opt/conda/envs/mcclintock/bin/activate '/work/mcclintock/install/envs/conda/c707b3e8'; set -euo pipefail; /opt/conda/envs/mcclintock/bin/python3.7 /data/mcclintock/test/output/snakemake/3802957/.snakemake/scripts/tmpifvv9aex.teflon_run.py' returned non-zero exit status 1.
  File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
  File "/work/mcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
  File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
  File "/opt/conda/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
  File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper
Exiting because a job execution failed. Look above for error message
snakemake --use-conda --conda-prefix /work/mcclintock/install/envs/conda --quiet --configfile /data/mcclintock/test/output/snakemake/config/config_3802957.json --cores 10 /data/mcclintock/test/output/SRR800842_1/results/teflon/SRR800842_1_teflon_nonredundant.bed /data/mcclintock/test/output/SRR800842_1/results/summary/data/run/summary_report.txt

Is this a bug in the TEFLoN software, or should I use some extra option in the command?

Thank you, tj

pbasting commented 3 years ago

Hi @tomaszjacek,

can you post the contents of the TEFLoN specific log? That should make it easier for me to determine what is going wrong. Based on the paths in the error you posted, the TEFLoN log should be at: /data/mcclintock/test/output/log/*/teflon.log

Thanks, Preston

tomaszjacek commented 3 years ago

I'm sorry, I don't know how to attach the file. Is it possible here? For now I have to paste it. The teflon.log file is 1135 lines long, with "Processed 990100 reads..." repeated many times, but it ends with an error.

Thank you, tj

[M::mem_process_seqs] Processed 990100 reads in 83.325 CPU sec, 8.546 real sec
[M::process] read 990100 sequences (100000100 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (217, 401435, 74, 95)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (61, 132, 672)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1894)
[M::mem_pestat] mean and std.dev: (313.10, 375.83)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 2505)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (276, 301, 320)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (188, 408)
[M::mem_pestat] mean and std.dev: (298.18, 33.74)
[M::mem_pestat] low and high boundaries for proper pairs: (144, 452)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (257, 3703, 9499)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27983)
[M::mem_pestat] mean and std.dev: (4134.85, 3903.56)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 37225)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (495, 753, 1247)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2751)
[M::mem_pestat] mean and std.dev: (747.34, 386.19)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3503)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 86.595 CPU sec, 8.862 real sec
[M::process] read 990100 sequences (100000100 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (205, 396794, 77, 95)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (62, 140, 510)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1406)
[M::mem_pestat] mean and std.dev: (314.52, 367.90)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1854)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (275, 301, 319)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (187, 407)
[M::mem_pestat] mean and std.dev: (297.60, 34.11)
[M::mem_pestat] low and high boundaries for proper pairs: (143, 451)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (271, 4322, 8277)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 24289)
[M::mem_pestat] mean and std.dev: (3993.38, 3576.53)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 32295)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (449, 703, 1217)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2753)
[M::mem_pestat] mean and std.dev: (687.53, 371.81)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3521)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 92.206 CPU sec, 9.404 real sec
[M::process] read 918116 sequences (92729716 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (211, 394908, 65, 89)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (71, 135, 446)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1196)
[M::mem_pestat] mean and std.dev: (211.70, 216.78)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1571)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (274, 300, 319)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (184, 409)
[M::mem_pestat] mean and std.dev: (296.91, 34.69)
[M::mem_pestat] low and high boundaries for proper pairs: (139, 454)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (285, 2584, 9521)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27993)
[M::mem_pestat] mean and std.dev: (3933.18, 3790.37)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 37229)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (404, 643, 1227)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2873)
[M::mem_pestat] mean and std.dev: (683.83, 464.21)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3696)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 92.694 CPU sec, 9.479 real sec
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (174, 337492, 61, 93)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (69, 131, 548)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1506)
[M::mem_pestat] mean and std.dev: (310.03, 353.80)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1985)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (271, 298, 317)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (179, 409)
[M::mem_pestat] mean and std.dev: (294.40, 35.79)
[M::mem_pestat] low and high boundaries for proper pairs: (133, 455)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (308, 2984, 9472)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27800)
[M::mem_pestat] mean and std.dev: (4027.59, 3658.31)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 36964)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (513, 721, 809)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1401)
[M::mem_pestat] mean and std.dev: (719.77, 315.99)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1984)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 918116 reads in 97.453 CPU sec, 9.857 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 10 -Y /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered//teflon.prep_MP/teflon.mappingRef.fa /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_1.fq /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_2.fq
[main] Real time: 389.589 sec; CPU: 3788.028 sec
bwa mem -t 10 -Y /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered//teflon.prep_MP/teflon.mappingRef.fa /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_1.fq /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_2.fq > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sam
samtools view -Sb /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sam > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.bam
[bam_sort_core] merging from 20 files...
samtools sort -@ 10 -o /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.bam
samtools index /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam
awk: line 1: syntax error at or near *
Calculating alignment statistics
cmd: samtools stats -t /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/teflon.genomeSize.txt /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam
cmd: samtools depth -Q 20 /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam | awk '{sum+=$3; sumsq+=$3*$3} END {print "Average = ",sum/NR; print "Stdev = ",sqrt(sumsq/NR - (sum/NR)**2)}' > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.cov.txt
Insert size standard deviation estimated as 45. Use the override option if you suspect this is incorrect!
Warning: coverage could not be estimated, enter coverage manually
python /work/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 10 -q 20
Traceback (most recent call last):
  File "/work/mcclintock/install/tools/teflon/teflon_collapse.py", line 165, in <module>
    main()
  File "/work/mcclintock/install/tools/teflon/teflon_collapse.py", line 103, in main
    samples.append([line.split()[0], line.split()[1], [readLen, insz, sd, total_n,cov,cov_sd]])
UnboundLocalError: local variable 'cov' referenced before assignment
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20

-bash-4.2$ wc -l teflon.log
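
The traceback above shows `cov` being read before it was ever assigned, which is consistent with the earlier awk failure leaving the coverage file without usable values. The following is a minimal sketch of a more defensive parse, assuming the file contains lines of the form `Average =  <number>` and `Stdev =  <number>` (the format implied by the awk print statements in the log). `parse_coverage` is a hypothetical helper for illustration, not a function from TEFLoN.

```python
def parse_coverage(text):
    """Return (average, stdev) parsed from the text of a coverage summary.

    Assumed format (taken from the awk print statements in the log):
        Average =  <number>
        Stdev =  <number>
    Raises ValueError with a clear message if either value is missing.
    """
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        key = key.strip()
        if key in ("Average", "Stdev"):
            try:
                values[key] = float(raw)
            except ValueError:
                pass  # leave the key unset so the check below reports it
    missing = [k for k in ("Average", "Stdev") if k not in values]
    if missing:
        raise ValueError("coverage file is missing: " + ", ".join(missing))
    return values["Average"], values["Stdev"]
```

Failing early with a named missing field would point at the empty coverage file directly, rather than surfacing later as an `UnboundLocalError`.
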
cbergman commented 3 years ago

@tomaszjacek: thanks for your feedback on running McClintock. You can attach files by clicking on the bottom bar of the comment box and navigating in your finder/explorer and uploading. Alternatively, you can drag and drop files of select types into the comment box and it will upload automatically. See more here: https://docs.github.com/en/free-pro-team@latest/github/managing-your-work-on-github/file-attachments-on-issues-and-pull-requests

zhjpeng commented 3 years ago

Hi, when I run McClintock as follows:

python3 ${MCK}/mcclintock.py --reference ../10-reference/HaSCD2.fa \
                                 --consensus ../10-reference/Hadb-families_rename.fa \
                                 --first ../20-NGS/${K}/${K}_1.fastq \
                                 --second ../20-NGS/${K}/${K}_2.fastq \
                                 --proc 48 \
                                 --out ${K} \
                                 --locations ./TE_annotations/HaSCD2/reference_te_locations/unaugmented_inrefTEs.gff \
                                 --taxonomy ./TE_annotations/HaSCD2/te_taxonomy/unaugmented_taxonomy.tsv

I get some errors related to TEFLoN, as follows:

Error in rule teflon_run:
    jobid: 20
    output: /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/genotypes/sample.genotypes.txt
    conda-env: /home/dell/biosoft/mcclintock/install/envs/conda/54b8d4d7

RuleException:
CalledProcessError in line 49 of /home/dell/biosoft/mcclintock/snakefiles/teflon.snakefile:
Command 'source /home/dell/miniconda3/envs/mcclintock/bin/activate '/home/dell/biosoft/mcclintock/install/envs/conda/54b8d4d7'; set -euo pipefail;  /home/dell/miniconda3/envs/mcclintock/bin/python3.7 /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/snakemake/1571076/.snakemake/scripts/tmpc34m4ip0.teflon_run.py' returned non-zero exit status 1.
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
  File "/home/dell/biosoft/mcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper

The teflon.log is as follows:

writing TE bed files...
writing TE bed files completed!
reducing search space...
cmd: samtools view -@ 4 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_complete.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b
search space succesfully reduced...
new reduced bam file: /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.sam_files/mega_complete.bam
clustering TE positions...
[ ================================================== ] 100.00%
clustering TE positions completed!
final reduction of search space...
cmd: samtools view -@ 4 -q 20 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b
Error running samtools: p.returncode = 1
python /home/dell/biosoft/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/ -d /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.prep_TF/ -s /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 4 -q 20
python /home/dell/biosoft/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/ -d /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.prep_TF/ -s /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 4 -q 20

When I run the samtools view command manually as

samtools view -@ 4 -q 20 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b

I get the following error:

[bed_read] Parse error reading "/home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed" at line 63797
samtools view: Could not read file "/home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed"

So I looked at line 63797 of /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed, which reads:

4007749

It contains only one coordinate, which may be a start or an end. I also found another potential error:

chr19 4007485 Unchr32 651720 651859

This looks like a chimeric record.

So the error above may occur during the clustering of TE positions?
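
To locate every malformed record rather than only the first one samtools reports, a small scan along these lines could help. This is a sketch under the assumption that a valid record needs at least three whitespace-separated fields with integer start and end coordinates; `find_malformed_bed_lines` is a hypothetical name, not part of TEFLoN or samtools.

```python
def find_malformed_bed_lines(lines):
    """Yield (line_number, line) for lines that are not minimal BED records.

    A record is accepted when it has at least three whitespace-separated
    fields and fields 2 and 3 are integers with 0 <= start <= end.
    """
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        fields = stripped.split()
        ok = len(fields) >= 3
        if ok:
            try:
                start, end = int(fields[1]), int(fields[2])
                ok = 0 <= start <= end
            except ValueError:
                ok = False  # a coordinate field is not an integer
        if not ok:
            yield number, stripped
```

Passing an open file handle for mega_clustered.bed to this function would list each suspect line with its line number, which may make it easier to see whether the two patterns above are isolated or widespread.
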

pbasting commented 3 years ago

@tomaszjacek

pbasting commented 3 years ago

  • @zhjpeng (#76 (comment)) I have seen this issue before as well. It seems to be sample dependent. Most of my McClintock runs with TEFLoN do not have this issue but some specific samples will have this occur where the mega_clustered.bed is malformed.
  • I am fairly certain this is a bug in TEFLoN and not related to mcclintock, so I am going to work on replicating this bug outside of McClintock with just TEFLoN. Then I'll open an issue on the actual TEFLoN repository (https://github.com/jradrion/TEFLoN) to see if their developers know what is going on.
  • I'll let you know when I've posted the issue

zhjpeng commented 3 years ago

Thanks for your reply, I am running mcclintock in more samples and check whether other samples have similar errors.

tomaszjacek commented 3 years ago

Thank you, tj

pbasting commented 3 years ago

  • @tomaszjacek I've updated the mcclintock master branch b61563e with the change to the TEFLoN environment that now includes gawk. You should be able to update your mcclintock repository with a git pull. Then you should do a clean install with mcclintock.py --install which will install TEFLoN with the updated conda environment.
  • Let me know if this resolves the bug you were experiencing earlier.

cbergman commented 3 years ago

tomaszjacek commented 3 years ago

@pbasting

It works, Thank you, tj
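
For context on why adding gawk helps: the awk command in the earlier log computes the standard deviation with `(sum/NR)**2`, and `**` is an extension that some awk implementations do not accept, which would fit the "syntax error at or near *" message. That reading is an inference from the log, not something stated in the thread. The sketch below reproduces the same arithmetic in Python, assuming `samtools depth`-style input with the depth in the third column; `depth_stats` is a hypothetical helper.

```python
import math


def depth_stats(lines):
    """Return (average, stdev) of column 3 of `samtools depth`-style lines.

    Mirrors the awk one-liner in the log: the mean is sum/N and the
    standard deviation is sqrt(sumsq/N - mean**2).
    """
    n = 0
    total = 0.0
    total_sq = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip lines without a depth column
        depth = float(fields[2])
        n += 1
        total += depth
        total_sq += depth * depth
    if n == 0:
        raise ValueError("no depth records found")
    mean = total / n
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / n - mean ** 2, 0.0)
    return mean, math.sqrt(variance)
```

Raising on empty input makes the "coverage could not be estimated" case explicit instead of dividing by zero.
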

yuryfunikov commented 3 years ago

Unfortunately, git pull && mcclintock.py --install didn't help me. Is there any way to verify that TEFLoN was updated, and/or a way to get the version of the component being used?

pbasting commented 3 years ago

Hi @yuryfunikov,

Thanks!

Preston

yuryfunikov commented 3 years ago

Hi, and thanks for the answer.

This is what I got:

  1. I ran git pull && mcclintock.py --install
  2. git rev-parse HEAD
    mcclintock$ git rev-parse HEAD
    5849097de4f74b0b8b149cad138e31024082924c
  3. Then I ran:
    python3 ./../mcclintock/mcclintock.py -r dvir-all-chromosome-r.1.06.fasta -c asymmetric_TEs_v1.fasta -1 160JB_dna_seq_1_trimmed.fastq.gz -2 160JB_dna_seq_2_trimmed.fastq.gz -p 1 -m teflon -o mcclintock_out_assTEv1_160_refgen/ --resume --debug

    That resulted in the following error:

    RuleException:
    CalledProcessError in line 49 of /path/to/file/mcclintock/snakefiles/teflon.snakefile:
    Command 'source /opt/miniconda/envs/mcclintock/bin/activate '/path/to/file/mcclintock/install/envs/conda/cc1216b5'; set -euo pipefail;  /opt/miniconda/envs/mcclintock/bin/python3.7 /path/to/file/mcclintock_out_assTEv1_160_refgen/snakemake/3370691/.snakemake/scripts/tmp6dm6acdf.teflon_run.py' returned non-zero exit status 1.
    File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
    File "/path/to/filemcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
    File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
    File "/opt/miniconda/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
    File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper
    Shutting down, this might take some time.
    Exiting because a job execution failed. Look above for error message
    Complete log: //path/to/file/mcclintock_out_assTEv1_160_refgen/snakemake/3370691/.snakemake/log/2021-03-15T001010.010823.snakemake.log
  4. Then I checked the TEFLoN log:
    -rw-rw-r-- 1 sergey sergey   2425 Mar 15 00:16 ./mcclintock_out_assTEv1_160_refgen/logs/20210315.001008.3370691/teflon.log

    ./mcclintock_out_assTEv1_160_refgen/logs/20210315.001008.3370691/teflon.log:

writing TE bed files...
writing TE bed files completed!
reducing search space...
cmd: samtools view -@ 1 -L /path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/sample.bed_files/mega_complete.bed /path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/teflon.sorted.bam -b
Error running samtools: p.returncode = 1
py

I must also say that it looks like mega_complete.bed wasn't created at all:

/path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/sample.bed_files/mega_complete.bed: No such file or directory

I should also say that the pipeline used to work without problems, but then it started failing with this error from time to time, and now it fails every time we run the script.

Please let me know if you think I should file a new ticket regarding this.
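
Since the samtools step fails here because mega_complete.bed is absent, a pre-flight check before calling samtools would report the real cause sooner. This is only a sketch; `missing_inputs` is a hypothetical helper and does not exist in McClintock or TEFLoN.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the subset of `paths` that do not exist or are empty files."""
    missing = []
    for path in map(Path, paths):
        # Treat both absent files and zero-byte files as unusable inputs.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(str(path))
    return missing
```

Calling it with the BED and BAM paths from the logged samtools command, and stopping when the returned list is non-empty, would turn "Error running samtools: p.returncode = 1" into a message naming the missing file.
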

pbasting commented 3 years ago

Thanks @yuryfunikov, this looks like a similar problem to the one described in: https://github.com/bergmanlab/mcclintock/issues/76#issuecomment-761018575. We have contacted the TEFLoN developer and I think that the bug has been fixed (see: https://github.com/jradrion/TEFLoN/issues/8), but I am currently testing it and integrating the changes into mcclintock. I'll let you know when these changes have been integrated.

yuryfunikov commented 3 years ago

hi

Sorry for bothering you, but have you had a chance to look into this?

pbasting commented 3 years ago

@yuryfunikov Sorry for not replying earlier. I have integrated the most recent update to TEFLoN into mcclintock, so I'd suggest re-installing the newest version of mcclintock: https://github.com/bergmanlab/mcclintock/commit/40863acf11052b18afb4cdcd7b1124de48cba397 and trying TEFLoN again on your sample to see if the issue is resolved.