CGATOxford / UMI-tools

Tools for handling Unique Molecular Identifiers in NGS data sets
MIT License

umi_tools dedup Error: How to solve this problem? #463

Closed geng-lee closed 3 years ago

geng-lee commented 3 years ago

sort: unrecognized option '--no-PG'

Traceback (most recent call last):

File "/anaconda3/envs/clipper3/bin/umi_tools", line 8, in

sys.exit(main())

File "/anaconda3/envs/clipper3/lib/python3.7/site-packages/umi_tools/umi_tools.py", line 61, in main

module.main(sys.argv)

File "/envs/clipper3/lib/python3.7/site-packages/umi_tools/dedup.py", line 373, in main

pysam.sort("-o", sorted_out_name, "-O", sort_format, "--no-PG", out_name)

File "/backup/home/changxing/anaconda3/envs/clipper3/lib/python3.7/site-packages/pysam/utils.py", line 75, in call

stderr))

pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=Usage: samtools sort [options...] [in.bam]\nOptions:\n -l INT Set compression level, from 0 (uncompressed) to 9 (best)\n -m INT Set maximum memory per thread; suffix K/M/G recognized [768M]\n -n Sort by read name\n -t TAG Sort by value of TAG. Uses position as secondary index (or read name if -n is set)\n -o FILE Write final output to FILE rather than standard output\n -T PREFIX Write temporary files to PREFIX.nnnn.bam\n --input-fmt-option OPT[=VAL]\n Specify a single input file format option in the form\n of OPTION or OPTION=VALUE\n -O, --output-fmt FORMAT[,OPT[=VAL]]...\n Specify output format (SAM, BAM, CRAM)\n --output-fmt-option OPT[=VAL]\n Specify a single output file format option in the form\n of OPTION or OPTION=VALUE\n --reference FILE\n Reference sequence FASTA FILE [null]\n -@, --threads INT\n Number of additional threads to use [0]\n'

Detailed information


umi_tools dedup -I result/sorted/input_rep1.align.rmRep.sorted.bam --output-stats=result/deduplicated -S result/deduplicated/deduplicated.input_rep1.align.rmRep.sorted.bam

# UMI-tools version: 1.1.1
# output generated by dedup -I result/sorted/input_rep1.align.rmRep.sorted.bam --output-stats=result/deduplicated -S result/deduplicated/deduplicated.input_rep1.align.rmRep.sorted.bam
# job started at Mon Mar 15 18:08:01 2021 on ibmnode1 -- ab7fab97-82aa-4011-a86c-ecc673f31c0f
# pid: 5624, system: Linux 2.6.32-642.el6.x86_64 #1 SMP Tue May 10 17:27:01 UTC 2016 x86_64
# assigned_tag : None
# cell_tag : None
# cell_tag_delim : None
# cell_tag_split : -
# chimeric_pairs : use
# chrom : None
# compresslevel : 6
# detection_method : None
# filter_umi : None
# gene_tag : None
# gene_transcript_map : None
# get_umi_method : read_id
# ignore_tlen : False
# ignore_umi : False
# in_sam : False
# log2stderr : False
# loglevel : 1
# mapping_quality : 0
# method : directional
# no_sort_output : False
# out_sam : False
# output_unmapped : False
# paired : False
# per_cell : False
# per_contig : False
# per_gene : False
# random_seed : None
# read_length : False
# short_help : None
# skip_regex : ^(__|Unassigned)
# soft_clip_threshold : 4
# spliced : False
# stats : result/deduplicated
# stderr : <_io.TextIOWrapper name='' mode='w' encoding='utf-8'>
# stdin : <_io.TextIOWrapper name='result/sorted/input_rep1.align.rmRep.sorted.bam' mode='r' encoding='UTF-8'>
# stdlog : <_io.TextIOWrapper name='' mode='w' encoding='utf-8'>
# stdout : <_io.TextIOWrapper name='result/deduplicated/deduplicated.input_rep1.align.rmRep.sorted.bam' mode='w' encoding='UTF-8'>
# subset : None
# threshold : 1
# timeit_file : None
# timeit_header : None
# timeit_name : all
# tmpdir : None
# umisep :
# umi_tag : RX
# umi_tag_delim : None
# umi_tag_split : None
# umi_whitelist : None
# umi_whitelist_paired : None
# unmapped_reads : discard
# unpaired_reads : use
# whole_contig : False
2021-03-15 18:08:01,465 INFO command: dedup -I result/sorted/input_rep1.align.rmRep.sorted.bam --output-stats=result/deduplicated -S result/deduplicated/deduplicated.input_rep1.align.rmRep.sorted.bam
2021-03-15 18:08:07,557 INFO total_umis 1892373
2021-03-15 18:08:07,557 INFO #umis 582414
2021-03-15 18:08:13,400 INFO Written out 100000 reads
2021-03-15 18:08:18,946 INFO Written out 200000 reads
2021-03-15 18:08:31,990 INFO Written out 300000 reads
2021-03-15 18:08:38,294 INFO Written out 400000 reads
2021-03-15 18:08:44,221 INFO Written out 500000 reads
2021-03-15 18:08:50,574 INFO Written out 600000 reads
2021-03-15 18:08:56,445 INFO Written out 700000 reads
2021-03-15 18:08:59,143 INFO Parsed 1000000 input reads
2021-03-15 18:09:01,941 INFO Written out 800000 reads
2021-03-15 18:09:07,439 INFO Written out 900000 reads
2021-03-15 18:09:13,973 INFO Written out 1000000 reads
2021-03-15 18:09:20,717 INFO Written out 1100000 reads
2021-03-15 18:09:32,772 INFO Written out 1200000 reads
2021-03-15 18:09:40,997 INFO Written out 1300000 reads
sort: unrecognized option '--no-PG'
Traceback (most recent call last):
File "/anaconda3/envs/clipper3/bin/umi_tools", line 8, in
sys.exit(main())
File "/anaconda3/envs/clipper3/lib/python3.7/site-packages/umi_tools/umi_tools.py", line 61, in main
module.main(sys.argv)
File "/anaconda3/envs/clipper3/lib/python3.7/site-packages/umi_tools/dedup.py", line 373, in main
pysam.sort("-o", sorted_out_name, "-O", sort_format, "--no-PG", out_name)
File "/anaconda3/envs/clipper3/lib/python3.7/site-packages/pysam/utils.py", line 75, in call
stderr))
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=Usage: samtools sort [options...] [in.bam]\nOptions:\n -l INT Set compression level, from 0 (uncompressed) to 9 (best)\n -m INT Set maximum memory per thread; suffix K/M/G recognized [768M]\n -n Sort by read name\n -t TAG Sort by value of TAG. Uses position as secondary index (or read name if -n is set)\n -o FILE Write final output to FILE rather than standard output\n -T PREFIX Write temporary files to PREFIX.nnnn.bam\n --input-fmt-option OPT[=VAL]\n Specify a single input file format option in the form\n of OPTION or OPTION=VALUE\n -O, --output-fmt FORMAT[,OPT[=VAL]]...\n Specify output format (SAM, BAM, CRAM)\n --output-fmt-option OPT[=VAL]\n Specify a single output file format option in the form\n of OPTION or OPTION=VALUE\n --reference FILE\n Reference sequence FASTA FILE [null]\n -@, --threads INT\n Number of additional threads to use [0]\n'

IanSudbery commented 3 years ago

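This happens because the wrong version of pysam/samtools/htslib is installed: the samtools that your pysam wraps does not recognise the --no-PG option that umi_tools dedup passes to sort, as the usage message in your traceback shows.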

You should be able to fix this with

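conda install "pysam>=0.16.0.1" -c bioconda -c conda-forge

(The quotes matter: without them most shells treat > as an output redirect and write to a file named =0.16.0.1 instead of applying the version constraint.)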
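
To confirm which pysam the environment actually picks up before and after upgrading, a small check along these lines may help. This is only a sketch: the helper names `version_tuple` and `meets_minimum` are made up for illustration, the minimum is simply the version suggested above, and the script assumes pysam exposes `__version__` (the samtools version attribute is read defensively in case it is absent).

```python
import re

# Minimum pysam version suggested in the reply above (not an official constant).
MINIMUM_PYSAM = "0.16.0.1"


def version_tuple(version):
    """Turn '0.16.0.1' into (0, 16, 0, 1), keeping only the numeric components."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def meets_minimum(installed, minimum=MINIMUM_PYSAM):
    """Return True if `installed` is numerically at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    try:
        import pysam
    except ImportError:
        print("pysam is not installed in this environment")
    else:
        installed = pysam.__version__
        # The wrapped samtools version is looked up defensively (assumption).
        samtools = getattr(pysam, "__samtools_version__", "unknown")
        print("pysam", installed, "/ wrapped samtools", samtools)
        if meets_minimum(installed):
            print("pysam is new enough")
        else:
            print("pysam is older than", MINIMUM_PYSAM, "- upgrade it")
```

Run it with the Python interpreter from the same conda environment as umi_tools (here, clipper3), otherwise it may report a different pysam from the one that dedup imports.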
geng-lee commented 3 years ago