COMBINE-lab / RapMap

Rapid sensitive and accurate read mapping via quasi-mapping
GNU General Public License v3.0

SAM format issue #38

Closed darachm closed 7 years ago

darachm commented 7 years ago

Minor minor issue. When using samtools view on the RapMap output, I came across an error that SEQ was not of equal length to QUAL. Reading more, it seems that the SAM specification states for field 11:

This field can be a ‘*’ when quality is not stored. If not a ‘*’, SEQ 
must not be a ‘*’ and the length of the quality string ought to 
equal the length of SEQ.
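
As a rough illustration of the rule quoted above, the check can be sketched in a few lines of Python. This is only a sketch: the function names `qual_is_valid` and `find_bad_records` are made up for this example, and it assumes tab-separated SAM text in which SEQ is field 10 and QUAL is field 11.

```python
def qual_is_valid(seq: str, qual: str) -> bool:
    """Return True if a SEQ/QUAL pair satisfies the quoted rule for field 11."""
    if qual == "*":
        return True   # quality not stored: always acceptable
    if seq == "*":
        return False  # QUAL is present, so SEQ must not be '*'
    return len(seq) == len(qual)


def find_bad_records(lines):
    """Yield (line_number, read_name) for alignment lines that break the rule.

    Header lines (starting with '@') are skipped. Lines with fewer than the
    11 mandatory fields are also reported.
    """
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or not qual_is_valid(fields[9], fields[10]):
            yield number, fields[0]
```

Running `find_bad_records` over the lines of a SAM file gives a quick list of the records that a stricter parser such as samtools is likely to reject.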

So I perl'd a '*' into the 11-th position of the SAM file, and samtools (v1.2 using htslib 1.2.1 on arch linux) now converts it nicely to bam.
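
The workaround described above can be sketched in Python as follows. This is not the original Perl one-liner (which is not shown in the thread); `star_out_quality` is a hypothetical helper, and it assumes each alignment line carries all 11 mandatory tab-separated fields.

```python
def star_out_quality(line: str) -> str:
    """Replace QUAL (field 11) with '*' when its length differs from SEQ.

    Header lines and records whose QUAL is already '*' or already matches
    the length of SEQ are returned unchanged.
    """
    if line.startswith("@"):
        return line  # SAM header line, nothing to fix
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 11:
        seq, qual = fields[9], fields[10]
        if qual != "*" and len(qual) != len(seq):
            fields[10] = "*"
    return "\t".join(fields) + newline
```

Applied line by line to the SAM text (for example, reading from `sys.stdin` and writing each result to `sys.stdout`), this leaves optional tag fields after column 11 untouched.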

Am I right to think that your tool ought to stick a '*' in that position?

By the way, your program is obscenely fast.

rob-p commented 7 years ago

Hi @darachm,

You are right to think that, in this case, we ought to stick a '*' in that position. Actually, the branch of RapMap on which most recent development has been taking place, i.e., develop-salmon, already uses a `*` for *all* quality strings (since RapMap is providing a mapping instead of an alignment anyway, and not writing out the quality string saves a considerable amount of space in the resulting SAM / BAM file). It's probably about time that I merge the changes and improvements from that branch back into master anyway, so I'd be happy to do that and cut a new release when I have a chance (in the next couple of days). Is the quality string something you would need, or would you be OK with simply having a `*` in that field anyway?

Best, Rob

P.S. Thanks for the speed comment --- that's what we're aiming for.

darachm commented 7 years ago

For my use, I'm happy with a *. It saves me a one-liner.

Thanks,

Darach

rob-p commented 7 years ago

Hi @darachm,

I've tagged a new release. Can you see if this works for you? If so, I will close this issue.

Best, Rob

darachm commented 7 years ago

Well, I'm trying but my system is a bit of a mess. That release is failing to link giving me something like:

/usr/bin/ld: /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.1/../../../../lib/libz.a(gzlib.o): \
relocation R_X86_64_32 against `.rodata.str1.1' can not be used when \
making a shared object; recompile with -fPIC

Even though there's a -fPIC in the triggering command, so not sure what's going on there.

I didn't tell you previously, but I had to delete a few error-checking lines in syslimits.h, stdlib.h, and wchar2.h to get it to compile. I'm going to see if I can get it cleanly working, maybe I've got something broken on my end.


Re: the issue at hand, I believe it's resolved as long as RapMap outputs a * when there are no qualities to output.

darachm commented 7 years ago

I can't compile from source, but using the binaries I can do the mapping and the sam output is good (samtools reads it without complaints), so that issue is fixed.

Re: not installing, that's beyond the scope of this issue, but I'm getting this as an error:

/path/to/directory/RapMap-0.5.0/include/spdlog/fmt/bundled/format.h:2198:24: error: expected unqualified-id before numeric constant
     const unsigned CHAR_WIDTH = 1;

If you have any pointers, or if this might be a useful case for development, please let me know. Until then, I'll just use the binaries.

rob-p commented 7 years ago

I'd definitely like to solve compiling from source for you. But you're right; this isn't the ideal place for that issue. Can you open a new issue with this description? And in that issue can you provide details about your system (e.g., which OS, compiler version etc.). Since the actual SAM format issue is solved, I'll close this issue.

blahah commented 7 years ago

Can I suggest that @darachm you might find compiling from source easier by replicating the CI environment that Salmon uses? At a guess I'd say install docker, docker pull hbb:salmon_build, then run the /done/build.sh script inside a container. @rob-p you could take the effort you've already put into setting up the CI here and turn it into a short set of instructions for consistent building.

darachm commented 7 years ago

About this though, I was just looking to fiddle around with RapMap because it was new. I don't see myself contributing to RapMap development, so now I'm thinking that I'd be better served by just using Salmon. I'll invest my time in getting that working, and see how that compares.

Thanks y'all.

rob-p commented 7 years ago

Hi @darachm ,

Sure, if you're interested in transcript (txp) quantification, Salmon is the place to be (they use the same quasi-mapping algorithm under the hood anyway). Regardless, would you mind at least reporting your build environment (OS & version and compiler & version) so we can explore on our own, and so we know if future RapMap users report similar issues?

Thanks! Rob