GMOD / Apollo

Genome annotation editor with a Java Server backend and a Javascript client that runs in a web browser as a JBrowse plugin.
http://genomearchitect.readthedocs.io/
Other
126 stars 85 forks source link

display projected BAM tracks #686

Closed nathandunn closed 6 years ago

nathandunn commented 8 years ago

Doing this already with the REST API using the htseq API from BROAD:

== DONE ==

===

nathandunn commented 7 years ago

Uses this: https://github.com/timpalpant/java-genomics-io/tree/master/src/edu/unc/genomics/io/BAMFileReader

deepakunni3 commented 7 years ago

@nathandunn I think you are referring to htslib API from Broad: https://samtools.github.io/htsjdk/

htseq is an entirely different tool written in Python for estimating read counts from RNAseq BAMs ;)

nathandunn commented 7 years ago

Yes that is right. I think I ended up using the genomics Java lib from unc for bigwig and the same for bam.

nathandunn commented 7 years ago

Zoomed out histogram view will happen automatically once I populate those methods.

nathandunn commented 7 years ago
screen shot 2017-07-24 at 4 48 24 pm
nathandunn commented 7 years ago
NO Track assemblage null and {"organism":"21","clientToken":"11410287301476095970","sequenceList":[{"organism":"Honeybee","name":"GroupUn1076","start":0,"length":970,"end":970,"location":[{"fmin":0,"fmax":970}],"id":449,"reverse":false}]}
params [controller:bam, organismId:21, action:global, trackName:Forager RNA-Seq reads]
| Error 2017-07-25 16:28:08,278 [http-bio-8080-exec-13] ERROR errors.GrailsExceptionResolver  - SAMFormatException occurred when processing request: [GET] /apollo/bam/Forager%20RNA-Seq%20reads/21/stats/global
Error parsing SAM header. @HD line has non-conforming SO tag value: sorted.. Line:
| Error 2017-07-25 16:28:08,278 [http-bio-8080-exec-13] ERROR errors.GrailsExceptionResolver  - SAMFormatException occurred when processing request: [GET] /apollo/bam/Forager%20RNA-Seq%20reads/21/stats/global
Error parsing SAM header. @HD line has non-conforming SO tag value: sorted.. Line:
@HD VN:1.0  SO:sorted; File /opt/apollo/honeybee/data/bam/forager_Amel4.5_accepted_hits.bam; Line number 1. Stacktrace follows:
Message: Error parsing SAM header. @HD line has non-conforming SO tag value: sorted.. Line:
@HD VN:1.0  SO:sorted; File /opt/apollo/honeybee/data/bam/forager_Amel4.5_accepted_hits.bam; Line number 1
    Line | Method
->>  265 | reportErrorParsingLine in htsjdk.samtools.SAMTextHeaderCodec
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
|    236 | parseHDLine            in     ''
|    102 | decode . . . . . . . . in     ''
|    667 | readHeader             in htsjdk.samtools.BAMFileReader
|    298 | <init> . . . . . . . . in     ''
|    176 | <init>                 in     ''
|    376 | open . . . . . . . . . in htsjdk.samtools.SamReaderFactory$SamReaderFactoryImpl
|    202 | open                   in     ''
|    107 | global . . . . . . . . in org.bbop.apollo.BamController
|    198 | doFilter               in grails.plugin.cache.web.filter.PageFragmentCachingFilter
|     63 | doFilter . . . . . . . in grails.plugin.cache.web.filter.AbstractFilter
|    449 | executeChain           in org.apache.shiro.web.servlet.AbstractShiroFilter
nathandunn commented 7 years ago

JBrowse / volvox is using Alignment2, which we don't want to do. However, they are essentially the same.

nathandunn commented 7 years ago

The properties come in between Alignments.js and _AlignmentsMixin.js and they can be hidden in the configuration file.

nathandunn commented 7 years ago

Header properties:

[comments:[], sortOrder:unsorted, attributes:[VN=1.5], standardTags:[VN, GO, SO], programRecords:[], class:class htsjdk.samtools.SAMFileHeader, validationErrors:[], SAMString:@HD  VN:1.5
@SQ SN:ctgA LN:50001
, version:1.5, groupOrder:none, creator:null, textHeader:@SQ    SN:ctgA LN:50001
, sequenceDictionary:SAMSequenceDictionary:( sequences:1 length:50001  md5:3d62af8e60a6f9b772a2bcbcd2b5af92), readGroups:[]]

SAM Read properties:

[alignmentBlocks:[htsjdk.samtools.AlignmentBlock@f780326], duplicateReadFlag:false, mateAlignmentStart:0, start:19464, readNameLength:33, readGroup:null, readPairedFlag:false, mappingQuality:37, unclippedStart:19464, baseQualityString:2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222, readUnmappedFlag:false, binaryAttributes:htsjdk.samtools.SAMBinaryTagAndValue@5daa9, variableBinaryRepresentation:[99, 116, 103, 65, 95, 49, 57, 48, 52, 51, 95, 49, 57, 53, 54, 51, 95, 49, 58, 48, 58, 48, 95, 49, 58, 48, 58, 48, 95, 49, 56, 97, 48, 0, 64, 6, 0, 0, -127, 40, 40, 18, 33, 18, 20, -120, 17, 68, 20, 68, 33, 18, 17, 65, -126, 65, -124, 68, 66, 24, 36, -127, -127, 18, -120, 65, 24, 66, 33, -120, 68, 18, 17, 24, 17, -127, 24, -124, 40, 34, 18, 72, 34, 33, 17, 34, -126, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 88, 84, 65, 85, 78, 77, 67, 1, 88, 48, 67, 1, 88, 49, 67, 0, 88, 77, 67, 1, 88, 79, 67, 0, 88, 71, 67, 0, 77, 68, 90, 51, 49, 84, 54, 56, 0], readString:TACTCTACCAACAGTTAAGGAGGGCAACAAGATCGATGGGGCATCGTATAACTTGAATGCCATTGGACAAATAATAATTGCTCCACGTCCCAAACCTCAA, secondaryOrSupplementary:false, pairedReadName:ctgA_19043_19563_1:0:0_1:0:0_18a0, class:class htsjdk.samtools.BAMRecord, header:SAMFileHeader{VN=1.5}, valid:null, notPrimaryAlignmentFlag:false, inferredInsertSize:0, baseQualities:[17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17], readName:ctgA_19043_19563_1:0:0_1:0:0_18a0, readLength:100, referenceName:ctgA, readFailsVendorQualityCheckFlag:false, end:19563, contig:ctgA, cigarString:100M, attributes:[htsjdk.samtools.SAMRecord$SAMTagAndValue@6448454, htsjdk.samtools.SAMRecord$SAMTagAndValue@272215e6, htsjdk.samtools.SAMRecord$SAMTagAndValue@41b839b8, htsjdk.samtools.SAMRecord$SAMTagAndValue@e16739a, htsjdk.samtools.SAMRecord$SAMTagAndValue@2cd23028, htsjdk.samtools.SAMRecord$SAMTagAndValue@1b688d71, htsjdk.samtools.SAMRecord$SAMTagAndValue@1b1d125d, htsjdk.samtools.SAMRecord$SAMTagAndValue@262a2807], unclippedEnd:19563, readBases:[84, 65, 67, 84, 67, 84, 65, 67, 67, 65, 65, 67, 65, 71, 84, 84, 65, 65, 71, 71, 65, 71, 71, 71, 67, 65, 65, 67, 65, 65, 71, 65, 84, 67, 71, 65, 84, 71, 71, 71, 71, 67, 65, 84, 67, 71, 84, 65, 84, 65, 65, 67, 84, 84, 71, 65, 65, 84, 71, 67, 67, 65, 84, 84, 71, 71, 65, 67, 65, 65, 65, 84, 65, 65, 84, 65, 65, 84, 84, 71, 67, 84, 67, 67, 65, 67, 71, 84, 67, 67, 67, 65, 65, 65, 67, 67, 84, 67, 65, 65], mateReferenceIndex:-1, flags:16, alignmentEnd:19563, validationStringency:SILENT, SAMFlags:[READ_REVERSE_STRAND], cigar:100M, cigarLength:1, fileSource:null, readNegativeStrandFlag:true, attributesBinarySize:37, indexingBin:4682, mateReferenceName:*, SAMString:ctgA_19043_19563_1:0:0_1:0:0_18a0   16  ctgA    19464   37  100TACTCTACCAACAGTTAAGGAGGGCAACAAGATCGATGGGGCATCGTATAACTTGAATGCCATTGGACAAATAATAATTGCTCCACGTCCCAAACCTCAA 2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222    X0:i:1  X1:i:0  MD:Z:31T68  XG:i:0  NM:i:1  XM:i:1  XO:i:0  XT:A:U
, referenceIndex:0, alignmentStart:19464, supplementaryAlignmentFlag:false, originalBaseQualities:null]
nathandunn commented 7 years ago

BAM/LazyFeature.js has the remaining mapping components out of the tags and values.

Also, the tag values for a samrecord look like this:

SAMRecord tag: X0 -> 1
SAMRecord tag: X1 -> 0
SAMRecord tag: MD -> 31C0C5G61
SAMRecord tag: XG -> 0
SAMRecord tag: NM -> 3
SAMRecord tag: XM -> 3
SAMRecord tag: XO -> 0
SAMRecord tag: XT -> U
nathandunn commented 7 years ago

Test instance: ctgA_19043_19563_1:0:0_1:0:0_18a0 at ctgA:19281..19910 (631 b)

nathandunn commented 7 years ago

BAM is returned with file and byte fields as well. I think those are already being absorbed by the API and are just used for reading.

nathandunn commented 7 years ago

https://samtools.github.io/hts-specs/SAMtags.pdf

https://samtools.github.io/hts-specs/SAMv1.pdf

nathandunn commented 7 years ago

Its unclear how the cigar strings are failing. They should be returning things of the nature: 100M . The BAM / File, LazyFeature.js, etc. is all store related data that we are ignoring anyway in favor of just alignments. It seems like somehow the subfeatures are just not being passed in properly during the parsing process.

screen shot 2017-07-29 at 11 53 18 am screen shot 2017-07-29 at 11 53 16 am screen shot 2017-07-29 at 11 52 55 am

We need to emulate this code to generate the subfeatures properly:

    _cigarToSubfeats: function(cigar)    {
        var subfeats = [];
        var min = this._get('start');
        var max;
        var ops = this._parseCigar( cigar );
        for (var i = 0; i < ops.length; i++)  {
            var lop = ops[i][1];
            var op = ops[i][0];  // operation type
            // converting "=" to "E" to avoid possible problems later with non-alphanumeric type name
            if (op === "=")  { op = "E"; }

            switch (op) {
            case 'M':
            case 'D':
            case 'N':
            case 'E':
            case 'X':
                max = min + lop;
                break;
            case 'I':
                max = min;
                break;
            case 'P':  // not showing padding deletions (possibly change this later -- could treat same as 'I' ?? )
            case 'H':  // not showing hard clipping (since it's unaligned, and offset arg meant to be beginning of aligned part)
            case 'S':  // not showing soft clipping (since it's unaligned, and offset arg meant to be beginning of aligned part)
                break;
                // other possible cases
            }
            if( op !== 'N' ) {
                subfeats.push(
                    new SimpleFeature(
                        {
                            data: {
                                type: op,
                                start: min,
                                end: max,
                                strand: this._get('strand'),
                                cigar_op: lop+op
                            },
                            parent: this
                        })
                );
            }
            min = max;
        }
        return subfeats;
    }
nathandunn commented 7 years ago

For the mismatch handling, its a bit interseting. Its reinterpreting the cigarString and the MD string, elements which should have already been re-interpreted.

Ultimately, what we want is something that draws the subclass spans with the classes, alignmismatch mismatch base[T|A|G|C]

nathandunn commented 7 years ago

Mismatch rendered as class:

screen shot 2017-08-01 at 12 45 21 pm screen shot 2017-08-01 at 12 42 08 pm
nathandunn commented 7 years ago

The render error (not getting now, though):

screen shot 2017-07-31 at 3 37 51 pm
nathandunn commented 6 years ago

Closing in favor of #1826