Open nathanweeks opened 6 years ago
You might also consider the BigBed format, which keeps all info on the same line. The gtf2bed script https://github.com/dasmoth/gtf2bed and this PR can go some way towards accomplishing this: https://github.com/GMOD/jbrowse/pull/944
Hi @cmdcolin, thanks for the heads up regarding the forthcoming BigBed support in JBrowse. BED files appear to lack support for arbitrary key/value pairs like the GFF3 9th column, and our use case is to display interspecific gene model primary transcript alignments with their functional annotation.
Until GFF3 Gap attribute support is available in JBrowse, it seems like modifying our processing pipelines to handle SAM / output BAM might be the most realistic alternative.
It's not necessarily true that the BED format doesn't allow arbitrary key/value pairs. The BED "autosql" concept allows arbitrary information to be encoded in extra columns, and this can be converted from other formats using the gtf2bed program mentioned before.
See https://github.com/dasmoth/gtf2bed/blob/master/gencode.as as an example
If it is easier to support GFF3 Gap, I am all for that too. I just thought I'd point out the possible alternative :)
Thanks for the pointer regarding AutoSQL support in BigBed; I had missed that. Will https://github.com/GMOD/jbrowse/pull/944 add support for that as well?
@nathanweeks now there is some basic autosql support :)
@cmdcolin : cool, looking forward to trying it out!
Just to add a resource link which may be useful:
It would be convenient if JBrowse could render spliced alignments from tabix-indexed GFF3 files using the CIGAR string in the GFF3 Gap attribute (e.g., produced from gmap's gff3_match_est format), rather than requiring a less-storage-efficient approach of having a separate GFF3 line for each aligned segment.
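To illustrate the request, here is a minimal Python sketch (not JBrowse code) of how a single GFF3 line with a Gap attribute could be expanded into the aligned reference blocks that a renderer would draw. The function name `gap_to_blocks` is hypothetical. It assumes a plus-strand feature, 1-based inclusive coordinates, and the `M`/`I`/`D` operations from the GFF3 specification. Treating `N` as an intron-style skip on the reference is an assumption borrowed from SAM CIGAR semantics, not something the GFF3 specification defines. Frameshift operations (`F`, `R`) are not handled.

```python
def gap_to_blocks(start, gap):
    """Expand a GFF3 Gap attribute value into reference blocks.

    start: 1-based reference start of the feature (GFF3 column 4).
    gap:   Gap attribute value, e.g. "M8 D3 M6 I1 M6".
    Returns a list of (start, end) tuples, 1-based inclusive.
    Assumes a plus-strand feature (an assumption of this sketch).
    """
    blocks = []
    pos = start
    for token in gap.split():
        op, length = token[0], token[1:]
        if not length.isdigit():
            raise ValueError(f"malformed Gap operation: {token!r}")
        n = int(length)
        if op == "M":
            # Match: consumes reference and target, so it is drawn as a block.
            if blocks and blocks[-1][1] == pos - 1:
                # Contiguous with the previous block (e.g. after an I): merge.
                blocks[-1] = (blocks[-1][0], pos + n - 1)
            else:
                blocks.append((pos, pos + n - 1))
            pos += n
        elif op in ("D", "N"):
            # Gap in the target: skip reference bases without drawing.
            # "N" is an assumption (SAM-style intron), not part of GFF3.
            pos += n
        elif op == "I":
            # Gap in the reference: target-only bases, no reference advance.
            pass
        else:
            raise ValueError(f"unsupported Gap operation: {token!r}")
    return blocks


if __name__ == "__main__":
    print(gap_to_blocks(100, "M8 D3 M6 I1 M6"))
```

With this approach one GFF3 line carries the whole alignment, and the per-segment coordinates are derived at display time instead of being stored as separate lines.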