junchaoshi / sports1.1

Small non-coding RNA annotation Pipeline Optimized for rRNA- and tRNA-Derived Small RNAs
GNU General Public License v3.0

unmatch genome part #23

Closed daixiaozhuan closed 2 years ago

daixiaozhuan commented 3 years ago

Hi Junchao,

I have a question: when you map to rRNA_5S (your script is below), there are two steps, match genome and unmatch genome, but both steps use the same bowtie index for rRNA 5S. I don't understand why you map the reads that did not match the genome to rRNA 5S again.

My understanding is that if a read can't be mapped to the human genome, it can't be mapped to the human small RNA databases either.

It would be great if you could explain this part to me. Thanks.

'''
name=rRNA_5S
bowtie_address=/storeData/project/user/daixiaozhuan/reference/SPORTS1.0_smallRNAdb/Homo_sapiens/rRNAdb/human_rRNA_5S

# match genome part
echo ""
echo "match to ${name}-match_genome"
output_match_match_genome=${output_address}${input_query_name}match${name}_match_genome.fa
output_unmatch_match_genome=${output_address}${input_query_name}unmatch${name}_match_genome.fa
touch ${output_match_match_genome}
touch ${output_unmatch_match_genome}

bowtie ${bowtie_address} -f ${input_match} -v ${mismatch} -a -p ${thread} --fullref --norc --al ${output_match_match_genome} --un ${output_unmatch_match_genome} >> ${output_detail_match_genome}

# unmatch genome part
echo ""
echo "match to ${name}-unmatch_genome"
output_match_unmatch_genome=${output_address}${input_query_name}match${name}_unmatch_genome.fa
output_unmatch_unmatch_genome=${output_address}${input_query_name}unmatch${name}_ummatch_genome.fa
touch ${output_match_unmatch_genome}
touch ${output_unmatch_unmatch_genome}

bowtie ${bowtie_address} -f ${input_unmatch} -v ${mismatch} -a -p ${thread} --fullref --norc --al ${output_match_unmatch_genome} --un ${output_unmatch_unmatch_genome} >> ${output_detail_unmatch_genome}
'''

junchaoshi commented 2 years ago

"when the reads can't map to human genome, it can't map to human small RNA databases either." This statement might be true for human genome, but not for the genomes of other species which are not well assembled.