humanlongevity / HLA

xHLA: Fast and accurate HLA typing from short read sequence data

Read length for xHLA #66

Open serge2016 opened 12 months ago

serge2016 commented 12 months ago

Dear all, according to this line (https://github.com/humanlongevity/HLA/blob/master/bin/preprocess.pl#L7), xHLA requires reads of at least 70 nucleotides. Does this mean that we should not run xHLA on 2x50 paired-end Illumina reads?

serge2016 commented 12 months ago

https://www.pnas.org/doi/10.1073/pnas.1707945114

Preprocessing. The input data of xHLA is a BAM file where sequencing reads are mapped to the hg38 human reference assembly (excluding alt contigs). Both BWA’s mem mode (Version 0.7.15) (20) and Isaac (Version 0.14.02.06) (21) with default parameters work well with xHLA on diverse datasets. Because all genome sequencing projects produce a BAM file, the alignment step is not considered as part of xHLA. xHLA extracts relevant HLA reads from the BAM file (chromosome 6, position 29,886,751–33,090,696), then trims and filters them based on base quality scores. Trimming is based on BWA’s trimming algorithm with Phred quality cutoff 20 from the 3′ end after first trimming Ns. Reads <70 bp or with more than five positions with Phred quality score <4 after trimming are removed.
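To make the quoted filtering rule concrete, here is a minimal Python sketch of the trim-then-filter logic as the paper describes it. It is not the code in preprocess.pl. The function names and constants are hypothetical. Stripping Ns from both ends is an assumption, since the paper only says Ns are trimmed first. The quality trimming follows the commonly described BWA running-sum approach, without BWA's internal minimum read length. Qualities are passed as integer Phred scores.

```python
MIN_LEN = 70      # reads shorter than this after trimming are removed (from the paper)
TRIM_Q = 20       # Phred cutoff for 3' quality trimming (from the paper)
LOW_Q = 4         # positions below this Phred score count as "low quality"
MAX_LOW_Q = 5     # reads with more than this many low-quality positions are removed


def trim_ns(seq, quals):
    """Strip N bases from both ends (assumption: the paper does not say which ends)."""
    start, end = 0, len(seq)
    while start < end and seq[start] in "Nn":
        start += 1
    while end > start and seq[end - 1] in "Nn":
        end -= 1
    return seq[start:end], quals[start:end]


def bwa_quality_trim(seq, quals, cutoff=TRIM_Q):
    """BWA-style 3' trimming: cut where the running sum of (cutoff - q) peaks."""
    running = 0
    best = 0
    keep = len(seq)
    for i in range(len(seq) - 1, -1, -1):
        running += cutoff - quals[i]
        if running < 0:
            break  # the high-quality region clearly dominates from here on
        if running > best:
            best = running
            keep = i  # best cut point seen so far
    return seq[:keep], quals[:keep]


def passes_filter(seq, quals, min_len=MIN_LEN):
    """Return True if the read survives trimming and both filters."""
    seq, quals = trim_ns(seq, quals)
    seq, quals = bwa_quality_trim(seq, quals)
    if len(seq) < min_len:
        return False
    low = sum(1 for q in quals if q < LOW_Q)
    return low <= MAX_LOW_Q


if __name__ == "__main__":
    print(passes_filter("A" * 50, [40] * 50))    # a perfect 50 bp read
    print(passes_filter("A" * 100, [40] * 100))  # a perfect 100 bp read
```

If the implementation matches this description, a 2x50 dataset would lose every read at the length filter, since trimming can only shorten a read and no 50 bp read can reach 70 bp.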