mckennalab / FlashFry

FlashFry: The rapid CRISPR target site characterization tool
Other
63 stars 10 forks source link

Score: TSV columns off-by-one? #32

Closed photocyte closed 2 years ago

photocyte commented 2 years ago

Hi there,

I'm using FlashFry score, and the output TSV seems to have an issue where the header columns past "dangerous_GC" don't seem to match up w/ the data values. This is via opening the TSV with Excel & Pandas. I checked briefly in the TSV with a hexeditor and saw there was a double tab character in the data values near "dangerous_GC", but not in the header, so there is a mismatch with the columns in the header, and columns and the data

Here was my FlashFry execution.

jar=FlashFry-assembly-1.13.jar
java -Xmx4g -jar ${jar} score\
              --database ${db}\
              --scoringMetrics hsu2013,doench2014ontarget,doench2016cfd,moreno2015,bedannotator,dangerous,minot,reciprocalofftargets,rank \
              --input ${discover_output} \
              --output tmp/scored_${discover_output}

head of the relevant TSV is attached, and its result when I open it in Excel (showing columns offset) head_scored_discover_Scaf17_hotspots_1000.fasta.output.tsv.zip

head_scored_discover_Scaf17_hotspots_1000.fasta.output.xlsx

aaronmck commented 2 years ago

Thanks for the report @photocyte! I think this is a bug exposed by including the bedannotator scoring metric without a BED file parameter. The code then wrongly creates a spot for the output in the per-target output section but leaves it blank. I'll put in some better checks that the bedannotator parameters are included before scoring. Could you check this out by not including bedannotator in the scoringMetrics parameter and see if this fixes it? I'll work on an update as well.

photocyte commented 2 years ago

Yep! Can confirm, removing bedannotator from the --scoringMetrics does fix the TSV off by one issue. Thanks :) Hadn't thought too hard about the scoring parameters & makes sense that bedannotator would need a bed file.

aaronmck commented 2 years ago

Great! I have some new protections in place to catch this. I'll put it into the next release (just waiting on the resolution of another issue, should be soon).