maiziex / Aquila_stLFR

Human haplotype-resolved assembly and variant detection for stLFR, hybrid assembly for linked-reads
MIT License

Barcode format #4

Open Richard-Luepken opened 2 years ago

Richard-Luepken commented 2 years ago

Hello,

I am trying to run Aquila_stLFR on some stLFR data. As far as I understand, the Aquila_stLFR_fastq_preprocess script requires the barcode to be part of the read_id for both reads in a pair. In the data we obtained, however, the barcode is still part of the read itself. Can you tell me the format of the barcode and of the whole read_id that your preprocessing step requires?

For example: AAGTAGAATG-AAGTAGAATG-AAGTAGAATG or _AAGTAGAATG_AAGTAGAATGAAGTAGAATG or AAGTAGAATGAAGTAGAATGAAGTAGAATG

For the seq_id I assume:

_seqname#barcode/1 _seqname#barcode/2

Is that correct?

Best regards, Richard

maiziex commented 2 years ago

You can check the test dataset we provided on Zenodo: https://zenodo.org/record/5032380#.Y0MPi-zML8w
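One way to follow this suggestion is to print the first few read IDs from the downloaded test FASTQ files and compare them with your own data. The snippet below is a minimal sketch, not part of Aquila_stLFR: the helper names `open_fastq` and `fastq_headers` are made up for illustration, it assumes plain four-line FASTQ records, and the in-memory demo records use the `seqname#barcode/1` layout guessed in the question above, which this thread does not confirm.

```python
import gzip
import io
from itertools import islice


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def fastq_headers(handle, limit=5):
    """Yield the read-ID line of the first `limit` records.

    Assumes four lines per record (ID, sequence, '+', qualities),
    i.e. no line-wrapped sequences.
    """
    for i, line in enumerate(islice(handle, limit * 4)):
        if i % 4 == 0:  # first line of each record is the read ID
            yield line.rstrip("\n")


# Demo on in-memory data. The header layout here is only the guess from
# the question, used as a placeholder; it is NOT a confirmed format.
demo = io.StringIO(
    "@seqname1#barcode/1\nACGT\n+\nIIII\n"
    "@seqname2#barcode/1\nTTGA\n+\nIIII\n"
)
for header in fastq_headers(demo):
    print(header)
```

For the real files, replace the demo handle with `open_fastq(path)`, where `path` points to one of the FASTQ files from the Zenodo record, and run it on both reads of a pair to see how the barcode is written in each read ID.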