BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

Collapse creating combined isoforms #257

Closed eschiks closed 1 year ago

eschiks commented 1 year ago

Hello,

I am running into an issue with the collapse module on Nanopore DRS data: for many genes, reads are being assigned to an isoform that combines an upstream and a downstream gene (see the attached screenshot for an example). In this case, a few reads (maybe ~10) do span the junction between these genes, but thousands of reads do not. However, all reads are being assigned to the fusion gene. I have tried many parameters to remedy this, but nothing seems to do the trick. When I look at the raw FASTQ reads, they clearly end around the annotated TES.

The latest iteration of the command I've tried is here:

python ~/flair-1.7/flair.py collapse --stringent --isoformtss --no_gtf_end_adjustment --trust_ends --keep_intermediate --support 5 -t 16 -g ${genome_dir}/c_elegans.PRJNA13758.WS279.genomic.fa -f ${genome_dir}/c_elegans.PRJNA13758.WS279.canonical_geneset.gtf -r ${fastq_dir}/all_reps_tss_polya.fastq -q ${correct_dir}/all_corrected.bed --temp_dir ./tmp --generate_map -o collapse_new

I have also tried it without the --stringent, --isoformtss, --no_gtf_end_adjustment, and --trust_ends flags, but I run into the same issue for several genes.

Please let me know if there's something I'm missing or if you have any other suggestions.

-Erin

[Attached screenshot: Screen Shot 2023-04-26 at 10 31 12 AM]

eschiks commented 1 year ago

This issue was solved by using --annotation_reliant generate, removing --trust_ends, and adding the --check_splice flag.
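
For anyone landing here with the same problem, here is a rough sketch of what the revised command might look like. It is my reconstruction, not the exact command that was run: it applies only the three changes described above to the command from the first post, and it assumes every other flag was left as it was. The fallback directory names (genome, fastq, corrected) are placeholders I made up so the snippet can be dry-run; the script only prints the command unless you uncomment the last line.

```shell
#!/usr/bin/env bash
# Placeholder fallbacks (hypothetical): point these at your own directories.
genome_dir="${genome_dir:-genome}"
fastq_dir="${fastq_dir:-fastq}"
correct_dir="${correct_dir:-corrected}"

# Command from the original post with the reported fix applied.
# Assumption: flags not mentioned in the fix are kept unchanged.
cmd=(
  python "$HOME/flair-1.7/flair.py" collapse
  --annotation_reliant generate   # added, per the fix
  --check_splice                  # added, per the fix
  # --trust_ends has been removed, per the fix
  --stringent --isoformtss --no_gtf_end_adjustment
  --keep_intermediate --support 5 -t 16
  -g "${genome_dir}/c_elegans.PRJNA13758.WS279.genomic.fa"
  -f "${genome_dir}/c_elegans.PRJNA13758.WS279.canonical_geneset.gtf"
  -r "${fastq_dir}/all_reps_tss_polya.fastq"
  -q "${correct_dir}/all_corrected.bed"
  --temp_dir ./tmp --generate_map
  -o collapse_new
)

# Dry run: print the assembled command for inspection.
printf '%s ' "${cmd[@]}"
echo

# Uncomment to actually run FLAIR:
# "${cmd[@]}"
```

Keeping the arguments in a bash array avoids quoting problems if any of the paths contain spaces, and makes it easy to toggle individual flags while comparing collapse results.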