dcjones / quip

Compressing next-generation sequencing data with extreme prejudice.
http://www.cs.washington.edu/homes/dcjones/quip/
BSD 3-Clause "New" or "Revised" License

Problem_in_quip_file_decompression #15

Closed StefanoCast closed 11 years ago

StefanoCast commented 11 years ago

Hello, I am using the Quip package to compress and decompress BAM files. I get the error "Unknown SAM field type: " when I run the command "quip --input=quip --output=bam -r human_hg19_chr.fa input.qp". The file "input.qp" was produced by Quip compression of a *.BAM file obtained by aligning SOLiD fragment reads with the Lifescope software. Running "quip --test" on "input.qp" gives the same error. Any suggestions? Thanks.

dcjones commented 11 years ago

It seems that Lifescope uses BAM in an interesting way that I'm not supporting. Can you point me to some data generated from Lifescope, or just copy/paste a few SAM lines (e.g. the output of samtools view input.bam | head)?

StefanoCast commented 11 years ago

In the attachment, you can find the first lines of one of our Lifescope *.bam files. Thanks, Stefano

Stefano_Castellana

689_319_1407 99 chr1 14826 9 25M50H = 14916 114 ATGCCTGGAGGGAAAAGGCTGANNN JJJJJJJJJJJJJJJJJJJJJ;$(' XC:Z:AAA RG:Z:A1_3 NH:i:1 CM:i:2 NM:i:0 CQ:Z:@@?6/=@82;=@@/?@8@26@/<6/6/@@8/@6.88./6/86@@@2-2@//66@:;.;6/<=8?-./.6../@ CS:Z:T331302102200200020321210112301200030020021210020011221001101100000000203211
451_492_406 99 chr1 14914 46 75M = 15037 147 GGCAGGACAGAATTACGAGATGCTGGCCCAGGGCGGGCAGCGGCCCTGCCNCCTACCCTTGNNCCTCATGNNCAG JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJGJJJJJEEB+23<ACJJGA)4;==DJJGC4%(//32204)122 XC:Z:AAA RG:Z:A13 NH:i:1 CM:i:10 NM:i:2 CQ:Z:@@@@@8@@@@?@@@@@8@@@@@@8@@>@@;-2@6;/@>0;/?@>68@@-.?@A/0?<@2//@/2?@@822;? CS:Z:T103120211220303132223132103001200330031233330021302002310020112202213111012
581_1841_1065 163 chr1 14920 51 25M = 14982 131 ACAGAATTACGAGGTGCTGGCCCAG JJJJJJJJJJJJJJJJJJJJJJJJJ XC:Z:AAA RG:Z:A1_3 NH:i:1 CM:i:2 NM:i:1 CQ:Z:=@@@@@>@8@@@?@?@@<@@@??@@ CS:Z:T3112203031322011321030012
457_1625_1057 99 chr1 14926 48 75M = 15061 159 TTACAAGGTGCTGGCCCTCGGCGGGCAGCGNNNNNNNNNCCTACCCTTGCGCCCNNTNACCAGCTTGTTNAAGAG JJJJJJJJJJJJJJJJJ728GGE==FFE@,&&&&%###%488BB63;+-699A3))1%0GJJJJJJJD?(+-341 XC:Z:AAA RG:Z:A1_3 NH:i:1 CM:i:10 NM:i:3 CQ:Z:@@6@@@@6@@@@@@@<@/@2@2--?@?.-@:<52@;-.?@2;08@-902@@/2;@/23@?826@=:?-/;26- CS:Z:T003110201132103002230330031233130020302002310021133300013021012320110100222
450_1084_1189 163 chr1 14930 43 25M = 15019 151 GAGGTGCTGGCCCAGGGCGGGCAGC JJJJJJJJJJJJJJJJJJJJJJJJ6 XC:Z:AAA RG:Z:A1_3 NH:i:1 CM:i:2 NM:i:1 CQ:Z:@@@@@@@@@@@>@@@@@@80@@<@/ CS:Z:T1220113210300120033003123
669_678_1906 99 chr1 14933 38 62M13H = 15095 186 GTGCTGGCCCAGGGCGGGCAGCGGCCCTGCCTNCTACCCTTNNNNNTCATGACCAATTTGTT JJJJJJJJJJJJJJJJJJJ>2-,0128>>>,&@CJJC66&$'''//AEFHFDG=;5BC<; XC:Z:AAA RG:Z:A13 NH:i:1 CM:i:8 NM:i:2 CQ:Z:>>@@@8;@8@=@-:862>.320@@0-.?@/.?8>2-@@.--=@;_6@=6.66/06/@-972=/-2/<.-./< CS:Z:T111321030012003300312332300213023023100200003020131210103001102000232311121
628_137_1184 99 chr1 14934 46 71M4H = 15061 151 TGCTGGCCCAGGGCGGGCAGNGGCCNTGCCTCCTANCCTTGCGCCTCATGNCCAGNTTGTTAAAGAGATCC JJJJJJJJJJJJJJJJJ55+%-//7#>@CA97556%//////:;;>=54:&////'4@@?H8@@GF@JJF9 XC:Z:AAA RG:Z:A13 NH:i:1 CM:i:8 NM:i:1 CQ:Z:/@@8@<</?;/</.;--/@;=622?8-/<@@/_6@;_2/@=.8@//2?/;--@/8/<62.@\ CS:Z:T013210300120033003122303032130220232002010330221311101212011030022223201210
468_573222 99 chr1 14937 53 75M = 15049 136 TGGCCCAGGGCGGGCAGCGGCCCTGCCTCCTACTCTTGCGCCTCATGACCANCTTGTNGAAGAGATCCGACATCA J:1:BFHHJJJJJJJJJJJJJJGJJJJJJF>>155=@JJJJJJJJJJB>46+1116)9FJJJEJJJJJJJGE>= XC:Z:AAA RG:Z:A1_3 NH:i:1 CM:i:6 NM:i:1 CQ:Z:@@2@@@@@-;@@.A:@@@A@@0/=@@:;@@?.=@@@9A@@?2/@@?2868-:/?;/6@@8/.668/=00/< CS:Z:T013300120033003123303002130220233222013330221312101132011112022223203211321
581_18411065 83 chr1 14982 51 5H70M = 14920 -131 TGACCAGCNTGTTGAAGAGATCCGACNNNAAGTGCCCACCTTGGCTCGTGGCTCTCACTGCAACGGGAAA 47>B@;5-$+56==JJJJAAJJCFEE*<CDJJJJJJJ=EGEGJJJJJJJJJJJJJJJJJJJJJJJJJJ XC:Z:AAA RG:Z:A1_3 NH:i:1 CM:i:2 NM:i:0 CQ:Z:@@22@@?>@?@?2@@?@>6@6>28@80//2-?;;8-:/;8**:A6-/8---@6@//;/.///-./_/6__/. CS:Z:T000200310131211222230113223010201100311201331123023222202101112321012100201
450_10841189 83 chr1 15019 43 12H63M = 14930 -151 ACCTTGGNAGGTGGCTCTCACTGCAACGGGAAAGCCACAGACTGGGGTGAAGAGTTCAGTCAC >>EEB43./0,DJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ XC:Z:AAA RG:Z:A1_3 NH:i:1 CM:i:2 NM:i:2 CQ:Z:@@@@@@@@@@@@@@@@@@@@@@@@8@@?@@@@@@@@@@@>@@@@@@@@@<@28;?@-@6<@/@/@@///2@2//6 CS:Z:T111212120122202110001212211103200200310131211222230110222010201000010201031

dcjones commented 11 years ago

Thanks for sharing this with me. Try as I might, I haven't been able to reproduce this problem yet.

Would you be willing to share a larger sample with me?

StefanoCast commented 11 years ago

Sorry for the delay in answering. You can download a sample .bam file from here: https://docs.google.com/a/css-mendel.it/file/d/0BxaHJSxaSzhlYmNFX2lsSHlMSmM/edit?usp=sharing. It was produced by mapping SOLiD fragment reads with the Lifescope software; the file contains the .bam header plus the 'chromosome 1' alignment data. I hope it is useful.

Stefano_Castellana

dcjones commented 11 years ago

Thanks again for all the help tracking this down. There was a subtle bug that occurred in SAM/BAM files that have a lot of optional fields. It's now fixed in version 1.1.6.
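
For anyone who hits a similar "Unknown SAM field type" error, the following is a minimal Python sketch for checking how many optional fields each record carries and whether any of them uses a type code outside the set defined by the SAM specification (A, i, f, Z, H, B). It is not Quip's code and does not reproduce the bug fixed in 1.1.6, whose details are not given in this thread; the function name scan_optional_fields is hypothetical, and the example line is one of the records posted above.

```python
# Type codes the SAM specification allows for optional TAG:TYPE:VALUE fields.
VALID_SAM_TYPES = set("AifZHB")


def scan_optional_fields(sam_line):
    """Return (optional_field_count, malformed_fields) for one alignment line.

    Assumption: columns are split on any whitespace so that lines pasted
    from a web page (tabs turned into spaces) still parse. Real SAM is
    tab-delimited, and a Z value containing spaces would be split wrongly.
    """
    columns = sam_line.split()
    optional = columns[11:]  # the first 11 columns are mandatory
    malformed = []
    for field in optional:
        parts = field.split(":", 2)  # the value itself may contain ':'
        if len(parts) != 3 or parts[1] not in VALID_SAM_TYPES:
            malformed.append(field)
    return len(optional), malformed


if __name__ == "__main__":
    example = (
        "581_1841_1065 163 chr1 14920 51 25M = 14982 131 "
        "ACAGAATTACGAGGTGCTGGCCCAG JJJJJJJJJJJJJJJJJJJJJJJJJ "
        "XC:Z:AAA RG:Z:A1_3 NH:i:1 CM:i:2 NM:i:1 "
        "CQ:Z:=@@@@@>@8@@@?@?@@<@@@??@@ CS:Z:T3112203031322011321030012"
    )
    print(scan_optional_fields(example))
```

To scan a whole file, each line of "samtools view input.bam" output could be passed to the function. If every field in the original BAM is well formed, that would suggest the error arises in the compress/decompress round trip rather than in the input itself.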

StefanoCast commented 11 years ago

Thanks for your effort. We will test the new Quip version in the near future.