simon-anders / htseq

HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.
https://htseq.readthedocs.io/en/release_0.11.1/
GNU General Public License v3.0

-o flag returns error: "too few arguments" #71

Closed by rbenel 5 years ago

rbenel commented 5 years ago

Hi,

A few days ago I ran htseq-count with the -o flag and received the counts output plus an additional .sam file with gene names attached to the header lines of my alignment.sam file. Yesterday I tried to run the same command again, but now I receive this error message:

usage: htseq-count [options] alignment_file gff_file
htseq-count: error: too few arguments

It is important to note that without the -o flag I receive a count file. This is the command I am using:

/usr/bin/htseq-count -a 0 -o sample0074_o.sam /path/to/sample_0082.sam /path/to/gencode.v29.primary_assembly.annotation.gff3 > sample0074_htseq.txt

OS info:

NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.1 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Thank you in advance :)

iosonofabio commented 5 years ago

I think I know the issue, but just to clarify: what version of htseq are you using?

rbenel commented 5 years ago

htseq-count -h returns this line at the end of the output:

Part of the 'HTSeq' framework, version 0.11.0.

I have tried upgrading to version 0.11.1 and downgrading to version 0.10.0 and the issue persists...

rbenel commented 5 years ago

The most "recent" version I found this flag works on is 0.6.0 ... is this a reliable version to work with as a temporary fix?

iosonofabio commented 5 years ago

It's a bug, my fault for not understanding argparse properly. Will push 0.11.2 this week to fix 👍

rbenel commented 5 years ago

OK. Thanks, in the meantime I will continue with 0.6.0.

iosonofabio commented 5 years ago

Fixed in 0.11.2. Just pass -o multiple times, e.g. -o file1.sam -o file2.sam. The number of output SAM files must equal the number of input BAM/SAM files.
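
The thread does not show the parser code, so the following is only a sketch of one way an argparse mistake can produce this symptom. It assumes (unconfirmed here) that -o was declared with nargs='+', which lets the option swallow the positional arguments that follow it, and contrasts that with action='append', which takes one value per -o. The names toy-count, build_parser and try_parse are hypothetical and are not part of HTSeq.

```python
import argparse


def build_parser(greedy):
    """Build a toy parser shaped like the usage line quoted in the thread.

    greedy=True  -> -o declared with nargs='+' (consumes every following token)
    greedy=False -> -o declared with action='append' (one value per -o)
    """
    parser = argparse.ArgumentParser(prog="toy-count")
    if greedy:
        parser.add_argument("-o", dest="samouts", nargs="+", default=[])
    else:
        parser.add_argument("-o", dest="samouts", action="append", default=[])
    # One or more alignment files, followed by exactly one annotation file.
    parser.add_argument("alignment_files", nargs="+")
    parser.add_argument("gff_file")
    return parser


def try_parse(greedy, argv):
    """Return the parsed namespace, or None if argparse rejects the arguments."""
    try:
        return build_parser(greedy).parse_args(argv)
    except SystemExit:
        # argparse prints a usage message to stderr and exits on error.
        return None


if __name__ == "__main__":
    argv = ["-o", "out.sam", "in.sam", "annotation.gff3"]
    # Greedy -o takes all three tokens, so no positionals are left over.
    print(try_parse(True, argv))
    # Appending -o takes only "out.sam"; the positionals parse normally.
    print(try_parse(False, argv))
```

With the greedy declaration the positionals are reported as missing. Python 2's argparse words that failure as "too few arguments", while Python 3 lists the missing argument names instead.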

rbenel commented 5 years ago

How can the number of output SAM files be equal to the number of input SAM files? The point of the -o flag is to output a .SAM file with each line annotated with a feature assignment appended to the header row, and then to use > to obtain a counts.txt file... but there is still only one .SAM input file....

rbenel commented 5 years ago

Regardless, 0.11.2 still returns the same error. The installation seems to have worked fine:

Found existing installation: HTSeq 0.11.0
Uninstalling HTSeq-0.11.0:
Successfully uninstalled HTSeq-0.11.0
Successfully installed HTSeq-0.11.2

htseq-count -o sample0074_geneAttribute.sam -o counts_0074.txt ./sample0074.sam ./gencode.v29.primary_assembly.annotation.gff3

usage: htseq-count [options] alignment_file gff_file
htseq-count: error: too few arguments

iosonofabio commented 5 years ago

It seems like you are not very familiar with the documentation of htseq-count. Since version 0.7 or so you can input more than one SAM/BAM file and get one single table of counts into stdout. In addition, you can annotate every input SAM/BAM with the feature assignment using the -o flag and you will get, no surprise, one annotated SAM out for every unannotated SAM/BAM you put in. We don't merge SAM/BAM files, because that would lead to loss of information ("which file does this read come from?").

As for your later comment, you seem to erroneously believe that -o is used for the counts table. Just to make sure we are on the same page: the count table goes into stdout. Here's how you do that particular call:

htseq-count -o sample0074_geneAttribute.sam ./sample0074.sam ./gencode.v29.primary_assembly.annotation.gff3 > counts_0074.txt

Also let me remind you we like GTF better than GFF3 in general.
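
For scripted runs, the pairing rule described above (one -o per input file, counts table on stdout) can be sketched as a small Python helper. This is only an illustration: build_htseq_count_cmd and run_htseq_count are hypothetical names, the file names are placeholders, and the only htseq-count option used is the -o flag discussed in this thread.

```python
import subprocess


def build_htseq_count_cmd(alignment_files, annotation_file, sam_outs=()):
    """Assemble an htseq-count argument list with one -o per input file.

    Hypothetical helper: it only builds the list and does not run anything.
    """
    alignment_files = list(alignment_files)
    sam_outs = list(sam_outs)
    # Per the thread, annotated SAM outputs must pair one-to-one with inputs.
    if sam_outs and len(sam_outs) != len(alignment_files):
        raise ValueError(
            "got %d -o outputs for %d input files"
            % (len(sam_outs), len(alignment_files))
        )
    cmd = ["htseq-count"]
    for out in sam_outs:
        cmd += ["-o", out]  # repeat the flag rather than listing files after it
    cmd += alignment_files
    cmd.append(annotation_file)
    return cmd


def run_htseq_count(cmd, counts_path):
    """Run the command and redirect stdout (the counts table) to counts_path."""
    with open(counts_path, "w") as counts_file:
        subprocess.run(cmd, stdout=counts_file, check=True)


if __name__ == "__main__":
    # Placeholder file names; nothing is executed here, the list is only printed.
    print(build_htseq_count_cmd(
        ["sampleA.sam", "sampleB.sam"],
        "annotation.gtf",
        ["sampleA_annotated.sam", "sampleB_annotated.sam"],
    ))
```

Writing stdout to a file inside run_htseq_count plays the same role as the > redirection in the shell command above.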