wdecoster / nanocomp

Comparison of multiple long read datasets
MIT License
113 stars 9 forks

Different timestamp formats cause crash #57

Closed schorlton closed 2 years ago

schorlton commented 2 years ago

Hi @wdecoster,

Thanks for the NanoPack tools, they're fantastic!

I hit an error that seems to occur because different nanopore basecallers (or basecaller versions) upstream can produce different timestamp formats. The error is: TypeError: unsupported operand type(s) for -: 'str' and 'str'. It can be reproduced with the following files:

head -4 input1.fastq
@128da40c-750e-4d27-904a-3ffb06c8fdf4 runid=971be673fab48aaebd2838d612e17ffcbae04067 read=207 ch=391 start_time=2022-07-05T11:05:13.137461-07:00 flow_cell_id=FAT83688 protocol_group_id=2022_07_05 sample_id=no_sample barcode=barcode10 barcode_alias=barcode10 parent_read_id=128da40c-750e-4d27-904a-3ffb06c8fdf4 basecall_model_version_id=2021-05-17_dna_r9.4.1_minion_96_29d8704b
AAAA
+
++++

head -4 input2.fastq
@7ffda42b-7bec-465c-9f10-3534d6ec9ce1 runid=99a468dca48c7585428d83a664c4db82f18a2d26 sampleid=no_sample read=32340 ch=82 start_time=2022-06-24T05:37:18Z model_version_id=2021-05-17_dna_r9.4.1_minion_384_d37a2ab9
TTTT
+
''''

NanoComp -t 1 --fastq_rich input1.fastq input2.fastq --names test1 test2

Sam
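
For readers following along, here is a minimal sketch of how the two start_time values above differ and how this TypeError can arise in general. The helper parse_rich_header is hypothetical and written only for illustration; it is not nanoget's code, and the exact code path inside NanoComp that raises the error is not confirmed here. The headers are shortened copies of the two shown above.

```python
def parse_rich_header(header_line):
    """Split a rich FASTQ header into the read ID and a dict of key=value fields.

    Hypothetical helper for illustration only; not part of nanoget or NanoComp.
    """
    read_id, *fields = header_line.lstrip("@").split()
    return read_id, dict(f.split("=", 1) for f in fields if "=" in f)


# Shortened versions of the two headers from the report.
header1 = (
    "@128da40c-750e-4d27-904a-3ffb06c8fdf4 "
    "runid=971be673fab48aaebd2838d612e17ffcbae04067 read=207 ch=391 "
    "start_time=2022-07-05T11:05:13.137461-07:00 flow_cell_id=FAT83688"
)
header2 = (
    "@7ffda42b-7bec-465c-9f10-3534d6ec9ce1 "
    "runid=99a468dca48c7585428d83a664c4db82f18a2d26 sampleid=no_sample "
    "read=32340 ch=82 start_time=2022-06-24T05:37:18Z"
)

_, fields1 = parse_rich_header(header1)
_, fields2 = parse_rich_header(header2)

# One value has microseconds and a -07:00 offset, the other has whole
# seconds and a trailing "Z" (UTC).
start1 = fields1["start_time"]
start2 = fields2["start_time"]

# If timestamps are still plain strings when a duration is computed,
# Python raises the TypeError quoted in the report.
try:
    start1 - start2
except TypeError as err:
    error_message = str(err)
```

The point of the sketch is only that the two files encode start_time differently, so a parser that expects a single format could leave some values unparsed.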

wdecoster commented 2 years ago

Thanks for the detailed report. I will have a look.

wdecoster commented 2 years ago

I have fixed this in nanoget v1.18.0, which is now available on PyPI (and soon on conda).
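
As an illustration of the general approach (not necessarily how nanoget v1.18.0 implements the fix), mixed timestamp formats like the two in this report can be normalized to timezone-aware UTC datetimes with the standard library before any arithmetic. The function name parse_start_time is hypothetical, and treating offset-less timestamps as UTC is an assumption made for this sketch.

```python
from datetime import datetime, timezone


def parse_start_time(value):
    """Parse an ISO 8601 timestamp string into a timezone-aware UTC datetime.

    Hypothetical helper for illustration; not taken from nanoget.
    """
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so rewrite it as an explicit UTC offset for older interpreters.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


# The two start_time values from the report.
t1 = parse_start_time("2022-07-05T11:05:13.137461-07:00")
t2 = parse_start_time("2022-06-24T05:37:18Z")

# Both values are now datetime objects in the same timezone, so subtraction
# yields a timedelta instead of raising a TypeError.
elapsed = t1 - t2
```

Converting everything to UTC first also keeps comparisons between runs from different time zones consistent.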

schorlton commented 2 years ago

Thank you :tada: