YangLab / CIRCexplorer2

circular RNA analysis toolset
http://circexplorer2.readthedocs.org/

Fail to annotate MapSplice junctions #46

Closed BarryDigby closed 3 years ago

BarryDigby commented 3 years ago

Hello @kepbod, sorry to trouble you once again.

My issue should be evident from the title, so I will provide as much information as possible; hopefully you can shed some light on why CIRCexplorer2 produces 0 annotations for MapSplice output.

The reference files used for this analysis are the latest GRCh38 gencode files, which you helped produce in issue #43.

MapSplice call:

 gzip -d --force ${mapsplice_reads[0]}
 gzip -d --force ${mapsplice_reads[1]}

 mapsplice.py -c $genome_dir -x $bowtie_dir -1 ${base}_R1.fastq -2 ${base}_R2.fastq \
  -p 8 --bam --seglen 20 --min-map-len 40 --fusion-non-canonical --min-fusion-distance 200 \
  --gene-gtf $gtf -o $base 

CIRCexplorer2 parse call:

CIRCexplorer2 parse -t MapSplice $fusion -b ${base}.mapsplice.junction.bed

CIRCexplorer2 annotate call:

CIRCexplorer2 annotate -r $ref_txt -g $genome -b ${base}.mapsplice.junction.bed -o ${base}.mapsplice.circRNA.txt
Start CIRCexplorer2 annotate at 16:34:52
Start to annotate fusion junctions...
Annotated 0 fusion junctions!
Start to fix fusion junctions...
Fixed 0 fusion junctions!
End CIRCexplorer2 annotate at 16:34:59

$fusion file -> fusions_raw.txt

${base}.mapsplice.junction.bed file (converted to .txt for upload) -> mapslice.junction.txt

The junctions do overlap annotations in the hg38.txt file, so I am not sure why none are being called. I can provide the reference files if needed but they were produced verbatim from #43.

Kind Regards,

Barry
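
The overlap statement above can be checked with a short script. This is only a minimal sketch under stated assumptions: the reference is taken to be a tab-separated genePred-style table with the chromosome in column 3 and the transcript start and end in columns 5 and 6, the junction file is taken to be a tab-separated BED, and the function name `count_overlapping_junctions` is hypothetical (it is not part of CIRCexplorer2).

```shell
# Hypothetical helper, not part of CIRCexplorer2.
# Counts junctions in a BED file whose interval overlaps at least one
# transcript in a genePred-style reference.
# Assumed reference columns: geneName, isoformName, chrom, strand, txStart, txEnd, ...
count_overlapping_junctions() {
  awk -F'\t' '
    # First file (reference): store transcript intervals per chromosome.
    NR == FNR { n[$3]++; s[$3, n[$3]] = $5 + 0; e[$3, n[$3]] = $6 + 0; next }
    # Second file (BED): compare each junction with transcripts on the same chromosome.
    {
      total++
      for (i = 1; i <= n[$1]; i++)
        if (($2 + 0) < e[$1, i] && ($3 + 0) > s[$1, i]) { hit++; break }
    }
    END { printf "%d of %d junctions overlap a transcript\n", hit, total }
  ' "$1" "$2"
}
```

With the variables from the commands above, it could be called as `count_overlapping_junctions "$ref_txt" "${base}.mapsplice.junction.bed"`. The nested loop is slow for very large files but keeps the sketch free of extra tools.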

kepbod commented 3 years ago

Hi Barry,

I checked your command lines and found no errors. The genomic spans of the junctions in the junction file (mapslice.junction.txt) are too long, so these junctions look like artifacts rather than circRNAs. One possible reason is the aligner: you could try another aligner (such as STAR) to see whether you get circRNAs. Another possible reason is that there are no circRNAs in your RNA-seq samples. Total RNA-seq or RNase R treatment could help to enrich circRNAs.

Xiao-Ou
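
To see how long the junction spans mentioned above actually are, something like the following could be used. It is a rough sketch, assuming a tab-separated BED file with start and end in columns 2 and 3. The function name `long_junctions` and the cutoff passed to it are illustrative assumptions, not values taken from CIRCexplorer2.

```shell
# Hypothetical helper, not part of CIRCexplorer2.
# Prints chrom, start, end and span for every junction whose genomic span
# (end - start) exceeds the cutoff given as the second argument.
# The cutoff is an arbitrary choice by the caller.
long_junctions() {
  awk -F'\t' -v max="$2" '
    BEGIN { OFS = "\t" }
    ($3 - $2) > max { print $1, $2, $3, $3 - $2 }
  ' "$1"
}
```

For example, `long_junctions "${base}.mapsplice.junction.bed" 1000000` would list junctions spanning more than 1 Mb; a different cutoff may suit other data better.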

BarryDigby commented 3 years ago

Thank you for the prompt response.

BirongZhang commented 2 years ago

Hi all,

Sorry to comment on a closed issue, but I ran into the same problem. I used the STAR aligner, and everything works fine up to the annotation step.

Here is my STAR command and junction output (Homo_sapiens.GRCh38.103.gtf):

STAR \
--runMode alignReads \
--genomeDir $genome \
--runThreadN $cores \
--readFilesIn $fq \
--sjdbGTFfile $gtf  \
--outFileNamePrefix $align_out \
--outSAMtype BAM SortedByCoordinate \
--chimSegmentMin 10 \
--chimOutType Junctions \
--chimOutJunctionFormat 0
[Image: Screenshot 2021-10-13 at 18 28 31]

Here is my parse command and output:

 CIRCexplorer2 parse -t STAR Chimeric/SRR13199991_Chimeric.out.junction > CIRCexplorer2_parse.log
[Image: Screenshot 2021-10-13 at 18 30 06]

Here is my annotation command and output:

CIRCexplorer2 annotate -r hg19_ref_all.txt -g hg19.fa -b back_spliced_junction.bed -o circularRNA_known.txt > CIRCexplorer2_annotate.log
[Image: Screenshot 2021-10-13 at 18 32 00]

This produced a zero-byte circularRNA_known.txt file. Could you take a look at my commands? If they are correct, then, as you said before, no circular RNA can be found in this data. Thanks.

Kind regards, Birong
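
One detail visible in the commands above is that the STAR step names a GRCh38 Ensembl GTF, while the annotate step uses hg19 files. Whether that explains the empty output is not confirmed anywhere in this thread. Ensembl GTFs name chromosomes without a `chr` prefix (for example `1`), whereas UCSC-style hg19 files typically use `chr1`, so one quick check is whether the chromosome names in the junction BED appear in the reference at all. The sketch below assumes a tab-separated BED (chromosome in column 1) and a tab-separated genePred-style reference (chromosome in column 3). The function name `chroms_missing_from_ref` is hypothetical.

```shell
# Hypothetical helper, not part of CIRCexplorer2.
# Lists each chromosome name that occurs in the junction BED (column 1)
# but never in the annotation reference (assumed column 3).
chroms_missing_from_ref() {
  awk -F'\t' '
    # First file (reference): record every chromosome name seen.
    NR == FNR { in_ref[$3] = 1; next }
    # Second file (BED): print each unmatched chromosome name once.
    !($1 in in_ref) && !seen[$1]++ { print $1 }
  ' "$1" "$2"
}
```

Here it could be run as `chroms_missing_from_ref hg19_ref_all.txt back_spliced_junction.bed`. Any names it prints are junction chromosomes with no counterpart in the reference. Matching names alone would not rule out a mismatch between assemblies, since hg19 and GRCh38 coordinates differ.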