salbrec / seqQscorer

MIT License

Can seqQscorer work with a BWA index? #2

Closed KhanhLPBao closed 3 years ago

KhanhLPBao commented 3 years ago

Hi, I'm using BWA for alignment. Can the program run with a BWA genome index?

salbrec commented 3 years ago

Hi, thanks for your question!

Technically, you could use the mapping statistics from another aligner, as long as you can convert them into a file similar to the *.MAP files in the folder “feature_set_examples”, because seqQscorer parses those files automatically. Note that the folder contains only paired-end examples; single-end files have the same structure but fewer parameters. However, I do not recommend doing this.

All classification models used by seqQscorer were trained and validated using alignments from Bowtie2. Within those training and validation datasets the mapping is therefore consistent, and it was also shown that the models generalize to external (non-ENCODE) data based on the Bowtie2 features. Hence, if you derive the same type of features for your data, seqQscorer will provide consistent quality scores no matter which aligner you use for other downstream analyses. Consequently, the best approach is to redo the alignment with Bowtie2, derive the TSS and LOC features from the Bowtie2 mappings as well, and then run seqQscorer with these features. The Python wrapper “deriveFeatureSets.py” does this for you, and the Docker image contains all required packages; the only thing you need is to download the appropriate Bowtie index as described in the README.
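For readers who want to see what redoing the alignment with Bowtie2 involves outside the wrapper, here is a minimal sketch. It is not how “deriveFeatureSets.py” is implemented; it only builds a plain `bowtie2` call and stores the alignment summary that Bowtie2 prints to stderr. The helper names, the file paths, and the assumption that this summary is what a *.MAP-style file should contain are all illustrative and should be checked against the files in “feature_set_examples”.

```python
import subprocess
from typing import List, Optional


def build_bowtie2_command(
    index_prefix: str,
    fastq1: str,
    fastq2: Optional[str] = None,
    sam_out: str = "aligned.sam",
    threads: int = 4,
) -> List[str]:
    """Build a bowtie2 command line (hypothetical helper, not part of seqQscorer).

    index_prefix is the basename of a Bowtie2 index, i.e. the path
    without the .1.bt2 / .rev.1.bt2 suffixes.
    """
    cmd = ["bowtie2", "-p", str(threads), "-x", index_prefix]
    if fastq2 is None:
        # Single-end reads are passed with -U.
        cmd += ["-U", fastq1]
    else:
        # Paired-end reads are passed as mate files with -1 and -2.
        cmd += ["-1", fastq1, "-2", fastq2]
    cmd += ["-S", sam_out]
    return cmd


def run_and_save_summary(cmd: List[str], summary_path: str) -> None:
    """Run bowtie2 and write its stderr (the alignment summary) to a file.

    Assumption: the saved summary plays the role of a *.MAP-style file.
    """
    with open(summary_path, "w") as handle:
        subprocess.run(cmd, stderr=handle, check=True)
```

A call such as `run_and_save_summary(build_bowtie2_command("index/GRCh38", "reads_1.fastq", "reads_2.fastq"), "sample.MAP")` would then produce the summary file, with the index prefix and file names here being placeholders. In practice the wrapper inside the Docker image remains the recommended route, since it also derives the TSS and LOC features.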

Don’t hesitate to ask if I can assist or if there are further questions! Best regards, Steffen