Open lpryszcz opened 1 year ago
Sonia just pointed out that reads per barcode are reported in the QC output. It would be nice to include the probabilities and confidence interval in those files.
> head QC_files/fast5---bc_1_final_summary.stats
filename read_id
batch0.fast5 17396d2c-2693-482a-8023-2d59eba90cbf
batch0.fast5 0d2e699f-c1a5-4841-a0fb-f5aa22008dba
batch0.fast5 094d1ef3-fdb7-43de-97c4-8c366e9dc238
batch0.fast5 2a2aca8e-77a2-4de1-bb04-3756587e234d
batch0.fast5 147db6be-2f9d-4416-b1e1-dde20837a222
batch0.fast5 0f001057-386b-4e76-876c-ac15c01f4176
Hi Leszek, where can we find the information that we need to publish to the final output?
@soniacruciani
The deeplexicon output (TSV file) has several columns. We'd want to include Confidence Interval, but I think we could just copy the entire file as is. Example below.
fast5 ReadID Barcode Confidence Interval P_bc_1 P_bc_2 P_bc_3 P_bc_4
FAQ43205_ae8483a4_0.fast5 00084a6d-553e-4d09-936b-94d5f1b23007 bc_4 0.3033 0.02900 0.05911 0.30428 0.60762
FAQ43205_ae8483a4_0.fast5 0037e932-b453-458a-8ad7-6f6c399c2922 bc_3 0.9993 0.00001 0.00032 0.99966 0.00001
FAQ43205_ae8483a4_0.fast5 003d0fb7-6c5e-45b1-8a79-ab78e0a7a6dc bc_3 0.9757 0.00528 0.00934 0.98500 0.00038
FAQ43205_ae8483a4_0.fast5 0041152a-40cc-487e-be83-aa5f9d33c96a bc_3 0.9871 0.00079 0.00596 0.99310 0.00016
FAQ43205_ae8483a4_0.fast5 005975a1-65d6-4bef-85da-f90e048d30f8 bc_3 0.9288 0.02709 0.00667 0.95594 0.01030
FAQ43205_ae8483a4_0.fast5 005abdfe-4723-40b3-ad2c-c578d20c6c58 bc_3 0.9993 0.00000 0.00032 0.99965 0.00003
Hi Luca, could we save the deeplexicon table (.tsv) in the final output?
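If copying the whole table is not enough and per-barcode files are still wanted, something along these lines might work. This is only a rough sketch, not how the pipeline currently does it: it assumes the deeplexicon table is tab-separated with the header shown above, and the function name `split_deeplexicon_by_barcode` and the `<barcode>_deeplexicon.stats` output names are made up for illustration.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_deeplexicon_by_barcode(tsv_path, out_dir, barcode_col="Barcode"):
    """Write one file per barcode, keeping every deeplexicon column.

    Hypothetical helper: assumes a tab-separated table whose header
    contains a "Barcode" column, as in the example in this thread.
    Returns a dict mapping each barcode to its number of reads.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group rows by the assigned barcode.
    rows_by_barcode = defaultdict(list)
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        header = reader.fieldnames
        for row in reader:
            rows_by_barcode[row[barcode_col]].append(row)

    # Write one table per barcode with the original header, so the
    # confidence interval and per-barcode probabilities are preserved.
    counts = {}
    for barcode, rows in sorted(rows_by_barcode.items()):
        out_path = out_dir / f"{barcode}_deeplexicon.stats"  # assumed naming
        with open(out_path, "w", newline="") as handle:
            writer = csv.DictWriter(
                handle, fieldnames=header, delimiter="\t", lineterminator="\n"
            )
            writer.writeheader()
            writer.writerows(rows)
        counts[barcode] = len(rows)
    return counts
```

The returned counts give the reads per barcode, so they could be cross-checked against what the QC output already reports.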