galaxyproject / tools-devteam

Contains a set of Galaxy Tools mostly written by the Galaxy Team.

Update Tophat for Collection-Aware Read Group Mapping #202

Open jmchilton opened 9 years ago

jmchilton commented 9 years ago

Last bit of https://github.com/galaxyproject/tools-devteam/issues/92 which was mostly dealt with.

The problem with TopHat is that it does not accept arbitrary read group fields, or at least not all of the ones that are valid under the Broad/SAM specification. We could pick and choose pieces of the macros to reuse and adapt the conditionals to TopHat, or we could try to get TopHat to allow the remaining headers to be specified.

Tophat source code is at https://github.com/infphilo/tophat.

Read group headers currently supported:

    --rg-id                        <string>    (read group ID)
    --rg-sample                    <string>    (sample ID)
    --rg-library                   <string>    (library ID)
    --rg-description               <string>    (descriptive string, no tabs allowed)
    --rg-platform-unit             <string>    (e.g Illumina lane ID)
    --rg-center                    <string>    (sequencing center name)
    --rg-date                      <string>    (ISO 8601 date of the sequencing run)
    --rg-platform                  <string>    (Sequencing platform descriptor)

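Missing options include PI (predicted median insert size), KS (key sequence), FO (flow order), and PG (programs used to process the read group).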

See the SAM specification for definitions of these options: https://samtools.github.io/hts-specs/SAMv1.pdf.
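
To make the gap concrete, here is a minimal sketch of how a set of SAM `@RG` tags could be split into the TopHat flags listed above and the tags TopHat has no flag for. The flag names come from the help output above; the tag-to-flag pairing is inferred from the flag descriptions, and the names `TOPHAT_RG_FLAGS` and `tophat_rg_args` are hypothetical, not part of the existing Galaxy macros or of TopHat.

```python
# Assumed pairing of SAM @RG tags with the TopHat flags from the help output.
TOPHAT_RG_FLAGS = {
    "ID": "--rg-id",
    "SM": "--rg-sample",
    "LB": "--rg-library",
    "DS": "--rg-description",
    "PU": "--rg-platform-unit",
    "CN": "--rg-center",
    "DT": "--rg-date",
    "PL": "--rg-platform",
}


def tophat_rg_args(read_group):
    """Split a dict of @RG tags into TopHat arguments and unsupported tags.

    Hypothetical helper for illustration only.
    """
    args = []
    unsupported = []
    for tag, value in read_group.items():
        flag = TOPHAT_RG_FLAGS.get(tag.upper())
        if flag is None:
            # Tags such as PI, KS, FO, and PG end up here.
            unsupported.append(tag.upper())
        else:
            args.extend([flag, str(value)])
    return args, unsupported


if __name__ == "__main__":
    args, unsupported = tophat_rg_args({"ID": "rg1", "SM": "sample1", "PI": 300})
    print(args)
    print(unsupported)
```

A wrapper built along these lines could either drop the unsupported tags or refuse to run when any are present; which behavior is preferable is an open question in this issue.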

lparsons commented 8 years ago

Any movement on this, or thoughts on how we should proceed? https://github.com/galaxyproject/galaxy/issues/2006 would be an excellent stopgap (and help with other issues as well).

lparsons commented 6 years ago

Is this abandoned? I'd be plenty happy omitting the few lesser-used headers from the macros and getting something functional here. TopHat2 is well known and still used quite often. The lack of proper support for collections (which this falls under) is a problem.