gnarzisi closed this issue 6 years ago
Hi Giuseppe, thanks for reporting. This does look like a bug, but hopefully one that won't take too much to fix. I'll post here again when I figure out what is going on.
On Apr 25, 2018, at 11:47 AM, Giuseppe Narzisi notifications@github.com wrote:
Thank you Jeremiah for working on this tool. I am experiencing some strange behaviors and I wanted to give you a heads up.
I would like to collect stats based on different tags (e.g., MI, PS, etc.) but, independently of what I specify in the --tag option, I always obtain the same output (stats based on BX).
Here is the simple command I used (just for chr22):
samtools view -h $BAM chr22 | bxtools stats - -t MI > stats.MI.tsv
The reads do contain the other tags. I used the BAM file provided by the GIAB team. Below is an illustrative example of a read containing all the tags.
Another strange behavior I noticed is that the values reported in output for the AS column are all 0s. This seems odd since the majority of the reads have AS values different from 0.
ST-E00273:177:HMTTCCCXX:1:2120:6806:23477 105 chr22 10510039 60 101M27S chr21 8532882 -347 ATGTTTGGAATATAAAATCAGCAACTAATATGTATTTTCAAAGCATTATCAATACAGAGTGCTAAGTGACTTCACTGGGAAAGGTAGTCATATAAAGAACAGACTAATAGTCCGGGATTATTGTGAGG <<F,7AFKF,F,,F,FKFAFK7AAFKFFKKFF,,<F7,7,,,<AK,,<,,7,A,,F,,77AF,7FFK7,,,AKA<,,,7,,7,,,AFF,F,F<FAKFKA,,,,7,,,,,,,7,,(A<AK,,<7,,<,, DM:Z:1.236364 QT:Z:A<,F<FFA BC:Z:TCACATCA QX:Z:,AAF,<FFFFKFKKA< AM:A:1 XM:A:0 TR:Z:TAGTCGC TQ:Z:FKA,FKK AS:f:-93 RG:Z:27058:MissingLibrary:1:HMTTCCCXX:1 XS:f:-94 BX:Z:TGAATCGCAACTGGAG-1 XT:i:0 RX:Z:TGAATCGCAACTGGAG OM:i:60 PS:i:10464994 HP:i:2 PC:i:26 MI:i:28314638
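For anyone who wants to double-check which tags a read carries before running stats, here is a minimal Python sketch that parses the optional fields of one SAM record. It is not part of bxtools; the helper name `parse_sam_tags` is hypothetical, and the record below is an abbreviated copy of the read above (SEQ and QUAL replaced with `*`, only a few tags kept). It splits on any whitespace, so it assumes no tag value contains a space.

```python
def parse_sam_tags(sam_line):
    """Return {tag: value} for the optional fields (column 12 onward) of one SAM record.

    Hypothetical helper for illustration only; assumes no tag value contains whitespace.
    """
    casts = {"i": int, "f": float}  # other types (A, Z, H, B) are kept as strings
    tags = {}
    for field in sam_line.split()[11:]:
        # Split on the first two colons only, so values such as the RG string keep theirs.
        tag, typ, value = field.split(":", 2)
        tags[tag] = casts.get(typ, str)(value)
    return tags


# Abbreviated version of the read above: SEQ/QUAL set to "*", subset of tags.
record = "\t".join([
    "ST-E00273:177:HMTTCCCXX:1:2120:6806:23477", "105", "chr22", "10510039", "60",
    "101M27S", "chr21", "8532882", "-347", "*", "*",
    "AS:f:-93", "RG:Z:27058:MissingLibrary:1:HMTTCCCXX:1",
    "BX:Z:TGAATCGCAACTGGAG-1", "PS:i:10464994", "HP:i:2", "MI:i:28314638",
])

tags = parse_sam_tags(record)
print(tags["MI"], tags["PS"], tags["BX"], tags["AS"])
```

Note that AS is stored as a float (`AS:f:`) in this BAM rather than the more common integer type, so a reader that only handles integer tags would miss it.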
The tag option wasn't being correctly read in by bxstats, and there was an issue with the float tags as well. A test run on the BAM you linked to appears to be working correctly.
e.g.
28358269 66 398 19 -5
Let me know if there are other issues, but a recursive update of the git repos should fix these problems.
Everything seems to be working fine now. Thank you for the quick fix!
I have another feature suggestion: similar to the "mol" subcommand, it would be very useful to have a "phase-set" subcommand that generates a BED file with the minimum footprint for each phase set. In this case, multiple barcodes will be associated with each phase set, and they could be reported as a comma-separated list.
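To make the request concrete, here is a rough Python sketch of the idea, not a proposal for how bxtools should implement it. The function name `phase_set_footprints`, the input tuple layout, and the toy reads are all made up for illustration; coordinates are assumed to be 0-based, half-open as in BED.

```python
from collections import defaultdict


def phase_set_footprints(reads):
    """Collapse reads into one BED-like row per (chrom, phase set).

    reads: iterable of (chrom, start, end, ps, bx) tuples (hypothetical layout).
    Returns rows of (chrom, min_start, max_end, ps, "bx1,bx2,...").
    """
    spans = {}
    barcodes = defaultdict(set)
    for chrom, start, end, ps, bx in reads:
        key = (chrom, ps)
        if key in spans:
            lo, hi = spans[key]
            spans[key] = (min(lo, start), max(hi, end))  # widen the footprint
        else:
            spans[key] = (start, end)
        barcodes[key].add(bx)
    return [
        (chrom, lo, hi, ps, ",".join(sorted(barcodes[(chrom, ps)])))
        for (chrom, ps), (lo, hi) in sorted(spans.items())
    ]


# Toy input, invented for this example.
reads = [
    ("chr22", 100, 200, 1, "AAAC-1"),
    ("chr22", 150, 400, 1, "GGGT-1"),
    ("chr22", 900, 1000, 2, "AAAC-1"),
]
rows = phase_set_footprints(reads)
for row in rows:
    print("\t".join(str(x) for x in row))
```

Each output row spans from the leftmost start to the rightmost end seen for that phase set, with the contributing barcodes in the last column.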
I just updated the repos to give mol the same tag-choice option (-t) as other commands. Would bxtools mol -t PS get you mostly what you need, aside from an extra BED field tracking which BX codes belong to which phase-sets?
That would be good enough for now.
I tried the new code with the -t PS option, but it does not seem to be ready. The output still seems to be the same as for the MI tag.
I just updated with this functionality. It was already tracking the BX tags, so I needed to print them. The -t option should now work in mol (see below example BED output):
Great! Thank you Jeremiah. Closing the ticket ;)