Open RJWANGbioinfo opened 4 years ago
Dear RJWANGbioinfo,
Thank you for using COMPSRA and writing response to me. The miRNA hsa-let-7a-2 is a pre-miRNA and it can form hsa-let-7a-5p and hsa-let-7a-2-3p. For pre-miRNAs, COMPSRA will only show the unique read count, which means that for all the reads mapped to the pre-miRNA region, they will remove the part of reads that can map to the mature miRNAs. In brief, you can think that N(pre-miRNA)=N(total)-N(mature miRNA).
Best Wishes, Jiang
Hi @cougarlj ,
Thanks for your prompt response! I just tried to extract the unique mapped reads from the bam using:
samtools view -b -q 255 sample01_17to50_FitRead_STAR_Aligned_miRNA.bam > unique.bam
And there are 0 reads found in the miRNA hsa-let-7a-2. Could you help me to figure out how to extract the consistent unique count of this miRNA from the bam output of COMPSRA?
And if possible, could you help me to understand the setting of '-aol'? If my understanding is correct, it sets the overlap percentage between reads and miRNA, 1.0 mean 100% overlapped. While it seems COMPSRA will still generate a result even if I set this value >1.0 (e.g., 1.5), why is that?
Thank you!
Dear @RJWANGbioinfo ,
Sorry to reply late. The bam file from COMPSRA doesn't store annotation information, so you don't know which read belongs to a certain miRNA. COMPSRA takes STAR as the aligner and I remember that you can provide an annotation file when you align reads to the genome. If that so, maybe you can extract the target reads and count it. Moreover, I remember COMPSRA can directly annotate pre-miRNA when you use -ac 11 after -ann. Also, COMPSRA has a toolkit that contains some easy functions. One of them is to extract reads with certain annotations. You can try the example: $COMPSRA -tk -sa chr21|16539828|16539911|+|hsa-let-7c|miRBase;chr21|16539838|16539859|+|hsa-let-7c-5p|miRBase -sam /your/output/source_test.sam -fq /your/target/output/target_test.fastq .
Your understanding about is '-aol' is right and when the value is greater than 1, COMPSRA will set it as 1.
Hope this can help you.
Best Wishes, Jiang Li
Hi @cougarlj
Thanks for your response. I tried the example command you mentioned as:
$ java -jar COMPSRA.jar -tk -sa chr21|16539828|16539911|+|hsa-let-7c|miRBase;chr21|16539838|16539859|+|hsa-let-7c-5p|miRBase -sam example_out/sample01/sample01_17to50_FitRead_STAR_Aligned.out.bam -fq example_out/sample01/sample01_17to50_FitRead.fastq.gz .
It gives me the below errors: bash: +: command not found
bash: hsa-let-7c: command not found
bash: miRBase: command not found
bash: 16539911: command not found
bash: 16539828: command not found
bash: +: command not found
bash: hsa-let-7c-5p: command not found
bash: miRBase: command not found
bash: 16539859: command not found
bash: 16539838: command not found
bash: chr21: command not found
Hi, I'm using the COMPASS to run some testing analysis. One issue makes me confused is the inconsistent count between the bam file and the miRNA count table. The command I used is:
java -jar COMPSRA.jar -ref hg38 -ann -ac 1 --ann_bam -in ./example_out/sample01/sample01_17to50_FitRead_STAR_Aligned.out.bam -out ./example_out/sample01/sample01_17to50_FitRead_STAR_Aligned
When I compare the count of miRNA from "sample01_17to50_FitRead_STAR_Aligned_miRNA.bam" versus the one in "sample01_17to50_FitRead_STAR_Aligned_miRNA.txt", it seems some of them not matched well, for instance, there are 34 reads mapped to hsa-let-7a-2 (chr11: 122146522-122146593) in bam file, while the count in the .txt file is only 1.
Could you help to figure out how to explain this? Thank you!