COMBINE-lab / RapMap

Rapid sensitive and accurate read mapping via quasi-mapping
GNU General Public License v3.0

Support different output formats #5

Closed: rob-p closed this issue 8 years ago

rob-p commented 9 years ago

Right now the output is in a format that was simply easy for me (Rob) to test with. We should look into supporting two different output formats:

  1. An efficient binary format for potential use with tools that don't want to read in a full mapping file. This format should be as concise as possible to minimize communication overhead --- potentially just a vector of some standard form of hit objects.
  2. A variant of the pseudobam format that is supported by Kallisto. This is essentially SAM format, minus the information that lightweight / pseudo-alignment does not provide. This format would be useful for downstream analysis, e.g. for inspecting the mappings with other tools.

rob-p commented 8 years ago

OK --- I've decided that 1. should not be an output format, but rather we should expose a library API of RapMap for other tools to use. This would allow other tools to just call e.g. rapmapper.getHits(read, hitVec) and have the library fill in the contents of the alignments.
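
To make the calling pattern concrete, here is a minimal sketch of what a `getHits(read, hitVec)`-style interface could look like from a client's point of view. Everything in it is an assumption for illustration: the `Hit` struct, the `ToyMapper` class and its exact k-mer lookup are hypothetical stand-ins, not RapMap's actual types or its quasi-mapping algorithm. Only the shape of the call (the caller owns the vector, the library fills it) follows the comment above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical hit record; the field names are illustrative only.
struct Hit {
    std::uint32_t transcriptId;  // index of the reference sequence
    std::int32_t pos;            // 0-based position of the match on that reference
    bool forward;                // orientation (this toy mapper only reports forward hits)
};

// Toy stand-in for a mapper: it indexes every k-mer of every reference and
// looks up the first k-mer of a read. This is NOT quasi-mapping; it only
// demonstrates the "caller passes a vector, library fills it" API shape.
class ToyMapper {
public:
    ToyMapper(const std::vector<std::string>& refs, std::size_t k) : k_(k) {
        for (std::uint32_t t = 0; t < refs.size(); ++t) {
            const std::string& s = refs[t];
            for (std::size_t i = 0; i + k_ <= s.size(); ++i) {
                index_[s.substr(i, k_)].push_back(
                    {t, static_cast<std::int32_t>(i), true});
            }
        }
    }

    // Clears hitVec, fills it with the hits for `read`, and returns true
    // if at least one hit was found.
    bool getHits(const std::string& read, std::vector<Hit>& hitVec) const {
        hitVec.clear();
        if (read.size() < k_) return false;
        auto it = index_.find(read.substr(0, k_));
        if (it == index_.end()) return false;
        hitVec = it->second;
        return true;
    }

private:
    std::size_t k_;
    std::unordered_map<std::string, std::vector<Hit>> index_;
};
```

Passing the vector by reference lets a caller reuse one buffer across many reads instead of allocating a new container per call, which is presumably the appeal of this signature over returning a fresh vector.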

There is currently a very preliminary implementation of 2. It looks "reasonable", but the SAM spec is not always easy to interpret, so this will need some serious testing and probably bug fixing.
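
As a rough illustration of what "SAM without the alignment details" can mean, the sketch below formats a single-end mapping as one SAM record and fills the fields a lightweight mapper does not compute with the placeholder values the SAM spec allows (`255` for an unavailable MAPQ, `*` for an unavailable CIGAR, mate name, or quality string, `0` for mate position and template length). The `LightweightHit` struct and `toSamLine` function are hypothetical names, and this is not RapMap's or Kallisto's actual writer.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical minimal mapping record: reference name, 0-based position, strand.
struct LightweightHit {
    std::string refName;
    std::int64_t pos;  // 0-based leftmost position on the reference
    bool reverse;      // true if the read maps to the reverse strand
};

// Builds one tab-separated SAM record with the 11 mandatory fields:
// QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL.
// Assumption: `seq` is already oriented as it should appear in the record
// (i.e. reverse-complemented by the caller for reverse-strand hits).
std::string toSamLine(const std::string& readName,
                      const std::string& seq,
                      const LightweightHit& h) {
    const int flag = h.reverse ? 0x10 : 0;  // 0x10 = read on reverse strand
    std::ostringstream os;
    os << readName << '\t'
       << flag << '\t'
       << h.refName << '\t'
       << (h.pos + 1) << '\t'  // SAM positions are 1-based
       << 255 << '\t'          // MAPQ unavailable
       << '*' << '\t'          // CIGAR unavailable
       << '*' << '\t'          // RNEXT: no mate
       << 0 << '\t'            // PNEXT: no mate
       << 0 << '\t'            // TLEN unavailable
       << seq << '\t'
       << '*';                 // QUAL not stored
    return os.str();
}
```

For example, `toSamLine("r1", "ACGT", {"tx1", 9, false})` yields a record with FLAG `0` and POS `10`. A paired-end writer would also need the pairing flags and the mate fields, which is where much of the interpretive difficulty with the spec tends to arise.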

rob-p commented 8 years ago

Since the SAM output currently works (though it obviously needs more testing), I'm closing this issue. Problems with the SAM output should be filed as separate tickets.