Open emreatbina opened 7 years ago
Hi @emreatbina,
That exception seems to be raised by jbinary internal checks and looks like it is related to this: https://github.com/hammerlab/pileup.js/blob/master/src/main/data/formats/bamTypes.js#L71
According to the BAM specification, those 4 chars should be BAM followed by the unicode character SOH (\u0001). It looks like your server is eating that character, and I am assuming this has something to do with the way nginx is set up. GitHub doesn't show it, but you can see the issue here:
Would you mind double checking your nginx setup to make sure that it can serve unicode characters? If the problem is not that, I am curious about how that BAM file was generated — in which case, a sliced version to reproduce the problem would be really useful.
Let me know!
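For reference, here is a minimal sketch of the kind of magic-number check described above. It is not the jbinary code linked from bamTypes.js; `hasBamMagic` is a hypothetical helper, and it assumes the bytes have already been decompressed.

```javascript
// Hypothetical helper, not part of pileup.js or jbinary.
// Per the BAM specification, a decompressed BAM stream starts with
// the four bytes 'B', 'A', 'M', 0x01 (SOH).
const BAM_MAGIC = [0x42, 0x41, 0x4d, 0x01];

function hasBamMagic(bytes) {
  if (bytes.length < BAM_MAGIC.length) return false;
  return BAM_MAGIC.every((expected, i) => bytes[i] === expected);
}

// Intact magic: 'B', 'A', 'M', SOH.
console.log(hasBamMagic(Uint8Array.from([0x42, 0x41, 0x4d, 0x01]))); // true
// Fourth byte replaced by something else (here '?'): the check fails.
console.log(hasBamMagic(Uint8Array.from([0x42, 0x41, 0x4d, 0x3f]))); // false
```

If the fourth byte is dropped or altered anywhere between the server and the parser, a check like this fails even though the first three characters still read "BAM".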
I tried the BAM files that you used in your demo as well and got the same error. In addition to Nginx, I tried a few Node.js servers (node-static and http-server) and made sure that the charset is UTF-8 for all of them. The HTTP response headers returned in my setup and in your demo are pretty much the same, but my response content contains some unicode characters for some reason, and that's probably why I'm getting this error. However, I can download the BAM files from my HTTP server and view them just fine with samtools and IGV.js.
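One way binary content can end up containing unexpected unicode characters is if it gets decoded as UTF-8 text somewhere along the way. Whether that is what happens in this setup is only an assumption; the sketch below just shows the effect on the first four bytes of a BGZF block (`1f 8b 08 04`) using Node.js `Buffer`.

```javascript
// Sketch under an assumption: the binary body is treated as UTF-8 text
// at some point. This is not confirmed to be the cause in this thread.
// First four bytes of a BGZF (gzip) block.
const original = Buffer.from([0x1f, 0x8b, 0x08, 0x04]);

// Decoding as UTF-8 replaces the invalid byte 0x8b with U+FFFD ...
const asText = original.toString('utf8');
// ... and re-encoding writes U+FFFD back out as the three bytes ef bf bd.
const roundTripped = Buffer.from(asText, 'utf8');

console.log(original.toString('hex'));     // 1f8b0804
console.log(roundTripped.toString('hex')); // 1fefbfbd0804
```

Tools that read the raw bytes, such as samtools, never go through this text round trip, which would be consistent with them reading the same file without trouble.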
@emreatbina: hmm, that is really weird. We have been testing those parsing APIs so hard that it is sad to see this failure, but we first have to look at a few things that might be causing this behavior.
If it is not an nginx configuration issue (e.g. if it forces the content that is proxied to be of a particular charset), then maybe we can blame the browser, since sometimes the proxy/webserver lets the client pick the encoding. If you are familiar with the devtools of any browser, I would love to see the request/response metadata to better debug this. If not, would you mind letting me know what this particular command produces on your end:
$ curl $BAMURL -H 'Range: bytes=0-3917' | gunzip | hexdump -n 16 -C
If gzip is disabled, you can just remove the gunzip part of the pipe.
Thanks, and sorry that you are having this issue.
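The same inspection can be sketched in Node.js for anyone who prefers it over the shell pipeline. This is a rough equivalent, not something pileup.js ships: `firstBytes` and `toHex` are hypothetical helpers, the network request is left out, and a locally built gzip buffer stands in for the downloaded range.

```javascript
const zlib = require('zlib');

// Decompress a (possibly truncated) gzip/BGZF chunk and return its first n bytes.
// Z_SYNC_FLUSH asks zlib to return what it could decode instead of
// failing when the input ends early, as a byte range usually does.
function firstBytes(compressed, n = 16) {
  const out = zlib.gunzipSync(compressed, {
    finishFlush: zlib.constants.Z_SYNC_FLUSH,
  });
  return out.subarray(0, n);
}

// Format bytes as space-separated hex, similar to one row of `hexdump -C`.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join(' ');
}

// Stand-in data: a gzip stream whose payload is the BAM magic.
const sample = zlib.gzipSync(Buffer.from('BAM\u0001', 'latin1'));
console.log(toHex(firstBytes(sample))); // 42 41 4d 01
```

In real use, `sample` would be replaced by the body of a ranged request for the BAM URL, read as raw bytes rather than text.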
Thanks for getting back. I'm sure we'll resolve this issue soon.
curl http://localhost:8000/synth3.normal.17.7500000-7515000.bam | samtools view
works fine. Same thing with IGV.js, I'm serving the BAM file with nginx and it can view it fine.
curl http://localhost:8000/synth3.normal.17.7500000-7515000.bam -s -H 'Range: bytes=0-3917' | gunzip | hexdump -n 16 -C
returns
00000000  42 41 4d 01 21 3e 00 00  40 48 44 09 56 4e 3a 31  |BAM.!>..@HD.VN:1|
00000010
curl http://www.hammerlab.org/pileup/test-data/synth3.normal.17.7500000-7515000.bam -s -H 'Range: bytes=0-3917' | gunzip | hexdump -n 16 -C
returns
00000000  42 41 4d 01 21 3e 00 00  40 48 44 09 56 4e 3a 31  |BAM.!>..@HD.VN:1|
00000010
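Both servers return the same 16 bytes. As a sanity check, they can be decoded against the BAM header layout from the specification (4-byte magic, little-endian 32-bit `l_text`, then the header text). The variable names below are illustrative only.

```javascript
// The 16 bytes from the hexdump output above.
const header = Buffer.from('42414d01213e000040484409564e3a31', 'hex');

const magic = header.subarray(0, 4);   // 'B', 'A', 'M', 0x01
const lText = header.readInt32LE(4);   // length of the SAM header text
const textStart = header.subarray(8).toString('latin1');

console.log(magic.toString('hex'));     // 42414d01
console.log(lText);                     // 15905
console.log(JSON.stringify(textStart)); // "@HD\tVN:1"
```

The magic is intact, including the trailing 0x01, so the file as served over plain HTTP looks well formed at this level.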
Sorry for the late reply, @emreatbina, but at this point I have no idea what might be going wrong. The only explanation I can think of for such inconsistent responses from the server is the client's request headers: your browser might be asking for something different and forcing the nginx server to use the wrong character encoding.
We haven't been testing the library against various combinations of browser settings, but this looks to me like an edge case. Let me know if you want to try to debug this on your side, but otherwise, I probably won't be able to take a look at it in the foreseeable future :(
I set up pileup.js in my own environment, where I am serving BAM files using Nginx. However, I see this error in the JS console. Do I need any specific headers sent by Nginx?