tsailabSJ / circleseq

GNU Affero General Public License v3.0

Allow manifest parameter to support different read lengths #28

Open shengdar opened 7 years ago

shengdar commented 7 years ago

Update the manifest to support different read lengths (for example, paired-end 75 bp instead of 150 bp).

Akitsu505 commented 6 years ago

I would love to have this read-length option, since 75 bp paired-end sequencing on a NextSeq is much more cost-effective than on a MiSeq.

For now, I suspect I could change the hard-coded positions used when reading the CIGAR operations in the "findCleavageSites.py" file?

I would really appreciate it if you could support this through a manifest option.

"Current findCleavageSites.py"

            for cigar_operation in read.cigar:
                # Identify positions that end in position 151 and start at position 151
                # Note strand polarity is reversed for position 151
                if cigar_operation.type == 'M':
                    if ((cigar_operation.query_from <= 146 - start_threshold) and
                            (151 - start_threshold <= cigar_operation.query_to)):
                        first_read_cigar = cigar_operation
                        first_read_chr = cigar_operation.ref_iv.chrom
                        first_end = min(cigar_operation.query_to, 151)
                        distance = first_end - cigar_operation.query_from
                        first_read_position = cigar_operation.ref_iv.start + distance - 1
                        first_read_strand = '-'
                    if ((cigar_operation.query_from <= 151 + start_threshold) and
                            (156 + start_threshold <= cigar_operation.query_to)):
                        second_read_cigar = cigar_operation
                        second_read_chr = cigar_operation.ref_iv.chrom
                        second_end = max(151, cigar_operation.query_from)
                        distance = second_end - cigar_operation.query_from
                        second_read_position = cigar_operation.ref_iv.start + distance
                        second_read_strand = '+'

"Something like this for 75bp paired reads?"

        for cigar_operation in read.cigar:
            # Identify positions that end in position 76 and start at position 76
            # Note strand polarity is reversed for position 76 
            if cigar_operation.type == 'M':
                if ((cigar_operation.query_from <= 71 - start_threshold) and
                        (76 - start_threshold <= cigar_operation.query_to)):
                    first_read_cigar = cigar_operation
                    first_read_chr = cigar_operation.ref_iv.chrom
                    first_end = min(cigar_operation.query_to, 76)
                    distance = first_end - cigar_operation.query_from
                    first_read_position = cigar_operation.ref_iv.start + distance - 1
                    first_read_strand = '-'
                if ((cigar_operation.query_from <= 76 + start_threshold) and
                        (81 + start_threshold <= cigar_operation.query_to)):
                    second_read_cigar = cigar_operation
                    second_read_chr = cigar_operation.ref_iv.chrom
                    second_end = max(76, cigar_operation.query_from)
                    distance = second_end - cigar_operation.query_from
                    second_read_position = cigar_operation.ref_iv.start + distance
                    second_read_strand = '+'
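
Rather than keeping one copy of the block per read length, the constants could perhaps be derived from the read length itself. The following is only a minimal sketch, assuming the junction sits at `read_length + 1` and that the fixed 5-base margin seen in both snippets (146 = 151 - 5, 156 = 151 + 5) is intended. `MatchOp`, `junction_positions`, `read_length`, and `margin` are hypothetical names, not part of circleseq; `MatchOp` only stands in for the CIGAR operation fields used above so the example runs without the alignment library.

```python
from collections import namedtuple

# Hypothetical stand-in for the CIGAR operation fields used in the snippets
# above (query_from, query_to, ref_iv.start); not a circleseq or library type.
MatchOp = namedtuple("MatchOp", ["query_from", "query_to", "ref_start"])


def junction_positions(op, read_length, start_threshold, margin=5):
    """Return (first_read_position, second_read_position) for one 'M' operation.

    Either value is None when the operation does not satisfy the
    corresponding condition. The margin of 5 is an assumption taken from
    the hard-coded 146/151/156 constants.
    """
    junction = read_length + 1  # 151 for 150 bp reads, 76 for 75 bp reads
    first_pos = None
    second_pos = None

    # Operation that ends at the junction (reported on the '-' strand above).
    if (op.query_from <= junction - margin - start_threshold and
            junction - start_threshold <= op.query_to):
        first_end = min(op.query_to, junction)
        first_pos = op.ref_start + (first_end - op.query_from) - 1

    # Operation that starts at the junction (reported on the '+' strand above).
    if (op.query_from <= junction + start_threshold and
            junction + margin + start_threshold <= op.query_to):
        second_end = max(junction, op.query_from)
        second_pos = op.ref_start + (second_end - op.query_from)

    return first_pos, second_pos


# Example with 75 bp reads and a start_threshold of 1 (illustrative values).
print(junction_positions(MatchOp(0, 76, 1000), read_length=75, start_threshold=1))
print(junction_positions(MatchOp(76, 150, 2000), read_length=75, start_threshold=1))
```

With `read_length=150` the thresholds come out as the 146/151/156 of the current code, and with `read_length=75` as the 71/76/81 proposed above, so a manifest entry would only need to supply the read length.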