deeptools / deepTools

Tools to process and analyze deep sequencing data.

isoform reduction #991

Closed mictadlo closed 4 years ago

mictadlo commented 4 years ago

Hi, I ran TransDecoder on the Scallop results and then used bamCoverage to create an RNA-Seq coverage profile. I noticed that the first isoform is not always the one that best matches the RNA-Seq profile, as the examples below show:

1. Screen Shot 2020-08-22 at 4 35 26 PM
2. Screen Shot 2020-08-22 at 4 38 45 PM
3. Screen Shot 2020-08-22 at 4 51 37 PM
4. Screen Shot 2020-08-22 at 4 58 42 PM
5. Screen Shot 2020-08-22 at 5 04 34 PM

By any chance, is there a way to keep only the best isoform?

Thank you in advance,

Michal

LeilyR commented 4 years ago

I am not sure I understand your point from the figures you sent. Is the isoform in the red box the best isoform? You will indeed get coverage over all isoforms if they are all in your BAM file. Wouldn't filtering your BAM file help you get the coverage of only the isoform you are interested in?

mictadlo commented 4 years ago

Hi, yes, the isoform in the red box is the best isoform for us. Do you have any suggestions on how to filter the BAM file?

Thank you in advance,

Michal

LeilyR commented 4 years ago

What about bedtools intersect?

mictadlo commented 4 years ago

I looked at bedtools intersect. By any chance, do you know how to generate the BED file?

Thank you in advance,

Michal

LeilyR commented 4 years ago

Well, my idea was to intersect your BAM file with a BED file that already contains the coordinates of your isoform of interest. You need to be careful about how you do it so that you get only that specific isoform, since your isoforms overlap. Check the parameters bedtools offers; you may even end up running it twice to exclude the other isoforms, I would say. If you really want to know how to make a BED file: they are just TSV files with certain columns, and you can check this link for more info about it.
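
To make the "TSV files with certain columns" point concrete, here is a minimal Python sketch that formats one six-column BED record. The helper name `bed_line` and the coordinates are hypothetical and only for illustration. The one convention worth remembering is that BED starts are 0-based and ends are exclusive.

```python
def bed_line(chrom, start, end, name=".", score=0, strand="."):
    """Format one BED6 record as a tab-separated line.

    BED coordinates are 0-based and half-open: `start` is the first base
    (counting from 0) and `end` is one past the last base.
    """
    if start < 0 or end <= start:
        raise ValueError("BED requires 0 <= start < end")
    fields = (chrom, start, end, name, score, strand)
    return "\t".join(str(field) for field in fields)


# Hypothetical isoform coordinates, for illustration only.
print(bed_line("chr1", 999, 2000, "gene1.t2", 0, "+"))
```

Writing one such line per isoform of interest to a text file gives a BED file that bedtools can read.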

mictadlo commented 4 years ago

We have around 50,000 expressed genes, so I would like to do this automatically rather than go through all the genes manually.
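
One way this could be automated, sketched in Python under a strong assumption: that you already have some per-isoform score (for example, mean RNA-Seq coverage over each isoform) and that "best" means the highest score within each gene. Neither the scoring rule nor the function name `best_isoform_per_gene` comes from Scallop, TransDecoder, or deepTools; they are placeholders for whatever criterion matches what you judged by eye in the screenshots.

```python
def best_isoform_per_gene(records):
    """Pick the highest-scoring isoform for each gene.

    `records` is an iterable of (gene_id, transcript_id, score) tuples.
    Returns a dict mapping gene_id -> transcript_id. On ties, the isoform
    seen first is kept.
    """
    best = {}
    for gene_id, transcript_id, score in records:
        if gene_id not in best or score > best[gene_id][1]:
            best[gene_id] = (transcript_id, score)
    return {gene_id: transcript_id for gene_id, (transcript_id, _) in best.items()}


# Hypothetical scores, for illustration only.
example = [
    ("gene.1", "gene.1.0", 3.5),
    ("gene.1", "gene.1.1", 12.0),
    ("gene.2", "gene.2.0", 7.1),
]
print(best_isoform_per_gene(example))
```

The resulting transcript IDs can then be used to keep only the matching records when building the BED file.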

LeilyR commented 4 years ago

I see. Are you sure that the tool you used to detect these genes doesn't offer BED output? Do you have GTF/GFF files that you want to convert to BED? In that case, you can check tools that do this conversion for you, such as BEDOPS or gffutils.
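
For reference, the core of such a conversion is small. The sketch below is plain Python and not the BEDOPS or gffutils API. It assumes the GTF contains `transcript` feature lines with a `transcript_id "..."` attribute, and it emits one BED6 line per transcript span. It does not handle GFF3, whose attributes use `key=value` syntax, and it drops the exon structure, so a dedicated converter is the safer choice for real data.

```python
import re


def gtf_transcripts_to_bed(gtf_lines):
    """Yield one BED6 line per 'transcript' feature found in GTF lines."""
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # keep only well-formed transcript records
        chrom, strand, attributes = fields[0], fields[6], fields[8]
        start, end = int(fields[3]), int(fields[4])
        match = re.search(r'transcript_id "([^"]+)"', attributes)
        name = match.group(1) if match else "."
        # GTF is 1-based inclusive; BED is 0-based half-open,
        # so only the start shifts by one.
        yield "\t".join([chrom, str(start - 1), str(end), name, "0", strand])
```

Usage would look like `with open("transcripts.gtf") as handle: bed = list(gtf_transcripts_to_bed(handle))`, where `transcripts.gtf` is a placeholder file name.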

mictadlo commented 4 years ago

I checked Scallop and TransDecoder, and it appears that they don't output BED files. Scallop provides a GTF file, while TransDecoder outputs a GFF3 file. Once I convert the GFF3/GTF file to BED, what would be the next step?

Thank you in advance,

Michal

LeilyR commented 4 years ago

Then you can use bedtools to intersect the regions of interest with the alignment file.
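
A minimal sketch of that step, driving `bedtools intersect` from Python. The file names are placeholders and the helper names are hypothetical. `-a` and `-b` are the standard inputs. `-split` is added on the assumption that the reads are spliced RNA-Seq alignments, so that only the aligned blocks, and not the introns they span, count as overlap. When `-a` is a BAM file, bedtools writes BAM to standard output.

```python
import subprocess


def intersect_command(bam_path, bed_path):
    """Build the bedtools command that keeps alignments overlapping the BED regions."""
    return ["bedtools", "intersect", "-a", bam_path, "-b", bed_path, "-split"]


def filter_bam(bam_path, bed_path, out_path):
    """Run bedtools and write the filtered BAM to out_path (needs bedtools on PATH)."""
    with open(out_path, "wb") as out:
        subprocess.run(intersect_command(bam_path, bed_path), stdout=out, check=True)


# Placeholder file names, for illustration only.
print(" ".join(intersect_command("rnaseq.bam", "best_isoforms.bed")))
```

Note that reads lying in regions shared by overlapping isoforms are still kept, so this narrows the coverage to the chosen regions without assigning reads to a specific isoform.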

LeilyR commented 4 years ago

I hope we were able to answer your question. I am closing this issue for now, but feel free to get back to us if any questions remain.