Hi developers of Dorado,
First, thank you for providing this great software.
I noticed that since issue #532, the RG tag and some other tags are automatically added to the FASTQ header, which I think is a bit unreasonable for the following reasons:
The tags, especially the RG tag, are long, which takes up more disk space and increases the I/O burden on devices like the Mk1C. If I want the full information to be well documented, BAM is clearly the better choice.
If I basecall my own data with --emit-fastq (which is not recommended), I already know the model from stdout or because I set it manually, so recording it again does not help.
If I upload my data to the SRA database to share it, SRA re-encodes the headers, so after fastq-dump the original long header is useless.
I fully recognize the earlier use cases such as minimap2 -y, but I think they are minor for the vast majority of users, since dorado can do the alignment itself. If I want FASTQ output, I want the header to be compact and quick to process, so I suggest making this feature optional or leaving it to third-party software.
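To illustrate what "leaving it to third-party software" could look like, here is a minimal sketch of a header-stripping filter. It is not part of Dorado. It assumes the extra tags follow the read ID after whitespace (for example tab-separated SAM-style tags) and that every record is exactly four lines; the function name `strip_header_tags` and the example header are hypothetical.

```python
from typing import Iterable, Iterator


def strip_header_tags(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTQ lines with each header reduced to '@<read_id>'.

    Assumption: strict 4-line records (no wrapped sequence lines), so
    the header is always line 0 of each group of four. This also keeps
    quality lines that happen to start with '@' untouched.
    """
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.startswith("@"):
            # Keep only the first whitespace-delimited token (the read ID).
            read_id = line.split(None, 1)[0]
            ending = "\n" if line.endswith("\n") else ""
            yield read_id + ending
        else:
            yield line


# Hypothetical record with SAM-style tags appended to the header.
example = [
    "@read1\tRG:Z:some_long_run_and_model_name\tqs:f:12.5\n",
    "ACGT\n",
    "+\n",
    "@III\n",
]
stripped = list(strip_header_tags(example))
```

In practice the same function can be applied to an open file handle, writing each yielded line to the output file, so the tags can be dropped after basecalling by anyone who does not need them.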