hammerlab / pileup.js

Interactive in-browser track viewer
Apache License 2.0

Having trouble viewing BAMs #457

Open emreatbina opened 7 years ago

emreatbina commented 7 years ago

I set up pileup.js in my own environment, where I am serving BAM files using Nginx. However, I see this error in the JS console. Do I need Nginx to send any specific headers?


pileup.min.js:3 Uncaught TypeError: Unexpected value (BAM !== BAM).
    at Object.read (pileup.min.js:3)
    at s.<anonymous> (pileup.min.js:3)
    at s.<anonymous> (pileup.min.js:3)
    at s.h._action (pileup.min.js:3)
    at s.h.read (pileup.min.js:3)
    at s.<anonymous> (pileup.min.js:3)
    at s.<anonymous> (pileup.min.js:3)
    at s.h.inContext (pileup.min.js:3)
    at Object.read (pileup.min.js:3)
    at s.<anonymous> (pileup.min.js:3)

armish commented 7 years ago

Hi @emreatbina,

That exception seems to be raised by jBinary's internal checks, and it looks like it is related to this: https://github.com/hammerlab/pileup.js/blob/master/src/main/data/formats/bamTypes.js#L71

According to the BAM specification, those 4 bytes should be the characters BAM followed by the control character SOH (\u0001, i.e. the byte 0x01). It looks like your server is eating that character, and I am assuming this has something to do with the way nginx is set up. GitHub doesn't show it, but you can see the issue here:

[Image: bam header]

Would you mind double-checking your nginx setup to make sure that it serves that character unchanged? If the problem is not that, I am curious how that BAM file was generated; in that case, a sliced version that reproduces the problem would be really useful.
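
For reference, the check that fails here can be sketched roughly as below. This is only an illustration, not the jBinary code that pileup.js actually runs: `hasBamMagic` and `BAM_MAGIC` are hypothetical names, and the sketch assumes the bytes passed in have already been decompressed.

```javascript
// Hypothetical helper, not part of pileup.js or jBinary.
// The BAM magic is the four bytes 'B', 'A', 'M', 0x01.
const BAM_MAGIC = [0x42, 0x41, 0x4d, 0x01];

function hasBamMagic(bytes) {
  // Too short to contain the magic at all.
  if (bytes.length < BAM_MAGIC.length) return false;
  // Compare byte by byte, not as text, so the 0x01 cannot be lost.
  return BAM_MAGIC.every((expected, i) => bytes[i] === expected);
}

// Intact header: 'B' 'A' 'M' 0x01.
console.log(hasBamMagic(Uint8Array.from([0x42, 0x41, 0x4d, 0x01]))); // true
// Same three letters, but the fourth byte is not 0x01.
console.log(hasBamMagic(Uint8Array.from([0x42, 0x41, 0x4d, 0x00]))); // false
```

Comparing raw bytes rather than strings is the point of the sketch: two headers can both display as "BAM" and still differ in the fourth byte.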

Let me know!

emreatbina commented 7 years ago

I tried the BAM files that you used in your demo as well and got the same error. In addition to Nginx, I tried a few Node.js servers (node-static and http-server) and made sure that the charset is UTF-8 for all of them. The HTTP response headers returned by my setup and by your demo are pretty much the same, but my response content contains some unicode characters for some reason, and that's probably why I'm getting this error. However, I can download the BAM files from my HTTP server and view them just fine with samtools and IGV.js.

armish commented 7 years ago

@emreatbina: hmm, that is really weird. We have tested those parsing APIs so thoroughly that it is sad to see this failure, but we first have to rule out a few things that might be causing this behavior:

  1. If nginx is corrupting the header, then I would have expected IGV and samtools to also complain about the missing unicode character. When you say those two can view the file without any problem, is that again through nginx, or are you talking about downloading the files from the server and then viewing them locally?
  2. If this is not a global nginx configuration issue (e.g. if it forces the proxied content to be of a particular charset), then maybe we can blame the browser, since sometimes the proxy/webserver lets the client pick the encoding. If you are familiar with the devtools of any browser, I would love to see the request/response metadata to better debug this. If not, would you mind letting me know what this command produces on your end:

         $ curl $BAMURL -H 'Range: bytes=0-3917' | gunzip | hexdump -n 16 -C

     If gzip is disabled, you can just remove the gunzip part of the pipe.
  3. As I mentioned, a sliced version of this BAM would be really helpful for us to debug and fix this if it is a pileup.js issue. If you have the time, that would be really appreciated ;)
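
If a shell is not handy, the same inspection can be approximated in Node.js. The following is a minimal sketch under a few assumptions: `headHex` is a hypothetical helper, the `sample` buffer is stand-in data rather than a real response from your server, and in practice you would pass in the bytes returned by the range request instead.

```javascript
const zlib = require('zlib');

// Hypothetical helper: gunzip the start of a (possibly truncated) byte range
// and return the first n decompressed bytes as space-separated hex.
function headHex(compressed, n = 16) {
  // Z_SYNC_FLUSH asks zlib to return what it could decode even if the
  // input stops partway through, as a range request usually does.
  const raw = zlib.gunzipSync(compressed, {
    finishFlush: zlib.constants.Z_SYNC_FLUSH,
  });
  return Array.from(raw.subarray(0, n), (b) =>
    b.toString(16).padStart(2, '0')
  ).join(' ');
}

// Stand-in data: a gzip stream whose payload is just the BAM magic.
const sample = zlib.gzipSync(Buffer.from([0x42, 0x41, 0x4d, 0x01]));
console.log(headHex(sample)); // 42 41 4d 01
```

A healthy BAM should start with `42 41 4d 01`; anything else in the first four bytes would point at the data being altered before it reaches the parser.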

Thanks and sorry that you are having this issue.

emreatbina commented 7 years ago

Thanks for getting back. I'm sure we'll resolve this issue soon.

  1. curl http://localhost:8000/synth3.normal.17.7500000-7515000.bam | samtools view works fine. Same thing with IGV.js: I'm serving the BAM file with nginx and it can view it fine.

  2. curl http://localhost:8000/synth3.normal.17.7500000-7515000.bam -s -H 'Range: bytes=0-3917' | gunzip | hexdump -n 16 -C returns

         00000000  42 41 4d 01 21 3e 00 00  40 48 44 09 56 4e 3a 31  |BAM.!>..@HD.VN:1|
         00000010

     curl http://www.hammerlab.org/pileup/test-data/synth3.normal.17.7500000-7515000.bam -s -H 'Range: bytes=0-3917' | gunzip | hexdump -n 16 -C returns

         00000000  42 41 4d 01 21 3e 00 00  40 48 44 09 56 4e 3a 31  |BAM.!>..@HD.VN:1|
         00000010

  3. Using the BAM file from your demo: http://www.hammerlab.org/pileup/test-data/synth3.normal.17.7500000-7515000.bam

armish commented 7 years ago

Sorry for the late reply, @emreatbina, but at this point I have no idea what might be going wrong. The only explanation I can think of for such inconsistent responses from the server is the client's request headers: your browser might be asking for something different and forcing the nginx server to use the wrong character encoding.

We haven't been testing the library against various combinations of browser settings, but this looks to me like an edge case. Let me know if you want to try to debug this on your side; otherwise, I probably won't be able to take a look at it in the foreseeable future :(