afombravo / 2FAST2Q

A Python3 program that counts sequence occurrences in raw FASTQ files.
GNU General Public License v3.0

pair-end and single-end file #6

Closed vuhongai closed 2 years ago

vuhongai commented 2 years ago

Hello, thank you for sharing. It's really fast indeed. I have one question: how can I specify whether the FASTQ files come from a single-end or paired-end run? If I have R1 and R2 files from paired-end reads, should I count them individually, or is there a better way to do that? Thank you. Ai

afombravo commented 2 years ago

Hi. Thank you for your comment. Currently 2FAST2Q doesn't support the paired-end format, as this would require more advanced alignment algorithms like the ones found in bowtie2, for example. For your case, the best you can do is indeed to run it twice, keeping in mind that you will probably need a file with the reverse complement of the features you are trying to align to. If you describe your use case, and if you want, I can help you further. Cheers, Afonso
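For reference, the reverse-complement step mentioned above could be sketched as follows. This is a minimal illustration, not part of 2FAST2Q: the function names and the example barcode sequences are hypothetical, and the features are held in a plain dict rather than in whatever file layout 2FAST2Q expects.

```python
# Translation table mapping each DNA base (plus N) to its complement.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def reverse_complement_features(features):
    """Map each feature name to the reverse complement of its sequence."""
    return {name: reverse_complement(seq) for name, seq in features.items()}


# Hypothetical 16-bp barcodes, for illustration only.
features = {
    "barcode_1": "ACGTTGCAAGGCTTAA",
    "barcode_2": "GGGTACCATTCGAGCT",
}
rc_features = reverse_complement_features(features)
for name, seq in rc_features.items():
    print(name, seq)
```

The reverse-complemented sequences would then be written out in the same format as the original features file and used for the R2 run.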

vuhongai commented 2 years ago

Hi, thank you for the quick reply. In my experiment, I deep-sequenced a PCR product that is a concatenation of three 16-bp barcodes. The amplicon length, including the PCR primers, is 120 bp. I did paired-end sequencing of these and would like to count individual variants. I think I will do as you suggested: run the files separately and try to merge the results from the reverse-complement sequences.
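One way to merge the two runs is to put the per-feature counts side by side, sketched below. It assumes the counts from the R1 run and from the R2 run (against reverse-complemented features) have already been loaded into dicts keyed by feature name; the loading step is omitted because the output layout is not described in this thread, and the function name and numbers are made up. The counts are kept as pairs rather than summed, because both mates of a pair come from the same fragment, so summing would count a fragment twice whenever both reads cover the barcode.

```python
def combine_counts(r1_counts, r2_counts):
    """Return {feature: (count_in_R1, count_in_R2)} for features seen in either run."""
    all_features = sorted(set(r1_counts) | set(r2_counts))
    return {f: (r1_counts.get(f, 0), r2_counts.get(f, 0)) for f in all_features}


# Made-up counts, for illustration only.
r1_counts = {"barcode_1": 120, "barcode_2": 45}
r2_counts = {"barcode_1": 118, "barcode_3": 7}

combined = combine_counts(r1_counts, r2_counts)
for feature, (n_r1, n_r2) in combined.items():
    print(feature, n_r1, n_r2)
```

Large disagreements between the two columns for a given feature would be worth inspecting before settling on a final count.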

vincentdebakker commented 8 months ago

Late to this, but just FYI, also for future questions on the topic: faced with the same issue, I have previously merged the reads prior to 2FAST2Q counting with a tool called PEAR: https://cme.h-its.org/exelixis/web/software/pear/. Worked like a charm.
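To illustrate what merging a read pair means, here is a deliberately naive sketch. It is not PEAR's algorithm: it ignores quality scores and mismatches, and simply reverse-complements R2 and joins it to R1 at the longest exact overlap. The function names, the `min_overlap` parameter, and the example fragment are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(r1, r2, min_overlap=10):
    """Merge a read pair at the longest exact overlap, or return None if none is found."""
    rc2 = reverse_complement(r2)  # bring R2 into the same orientation as R1
    for k in range(min(len(r1), len(rc2)), min_overlap - 1, -1):
        # Does the last k bases of R1 match the first k bases of the flipped R2?
        if r1[-k:] == rc2[:k]:
            return r1 + rc2[k:]
    return None


# Made-up 30-bp fragment read as two 20-bp mates overlapping by 10 bp.
fragment = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
r1 = fragment[:20]
r2 = reverse_complement(fragment[10:])

merged = merge_pair(r1, r2)
print(merged == fragment)
```

For real data a dedicated merger such as PEAR is the better choice, since sequencing errors in the overlap would make an exact-match approach like this one fail.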