Clinical-Genomics / chanjo

Chanjo provides a better way to analyze coverage data in clinical sequencing.
https://clinical-genomics.github.io/chanjo/
MIT License

Use GTF/GFF instead of CCDS bed? #175

Open biocyberman opened 8 years ago

biocyberman commented 8 years ago

I have been away from work that uses chanjo for a while, so I am out of date now that I am back. It is good to see how quickly chanjo is developing. I would like to check coverage for the genes/exons/transcripts contained in UCSC's or Ensembl's GTF files. I also know that there are tools (e.g. BEDOPS's gtf2bed) that can convert GTF to BED easily. So the real questions are:

(Q1) How do I use an arbitrary BED file, from whatever species or genome build I used to make my BAM file? What is the workflow? I could not find it in the documentation, so I wanted to check quickly here. If it is not currently possible, I would like to make it a feature request.

As far as I can see, the benefits of using GTF over CCDS are:

  1. GTF is more up to date than CCDS, so it offers newer annotation information.
  2. GTF is free from the constraint of the mouse-human relation.
  3. It opens up the freedom to use chanjo for all available genomes.

(Q2) I am therefore curious about the rationale for choosing CCDS over GTF when developing chanjo. I assume it has something to do with having clinical use in mind.

biocyberman commented 8 years ago

Q1 is answered here, assuming what it says is correct: https://github.com/robinandeer/chanjo/issues/166#issuecomment-190934307

I am still interested in Q2.

robinandeer commented 8 years ago

Q2) CCDS was just the choice I went with as an example, because it worked for us internally; we deal with rare inherited disorders.

Since then it has grown into the standard starting point in chanjo, but this really doesn't have to be the case. Early on I had a project to provide converters from popular sources, but I eventually scrapped it.

I can look into GTF, although it might contain too many elements that are not of clear clinical interest for us.
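
For illustration, one way to keep a GTF from flooding the analysis with elements of little clinical interest is to filter while converting, for example keeping only exon features of protein-coding genes. Below is a minimal sketch of that idea, not chanjo's own converter. The function name `gtf_exons_to_bed` and the output column order are hypothetical; the BED layout chanjo actually expects is not stated in this thread, so check it before loading the result. The sketch assumes a tab-separated GTF whose attributes include `gene_id`, `transcript_id`, and a biotype stored as `gene_biotype` (Ensembl) or `gene_type` (GENCODE).

```python
import re

# Matches quoted GTF attribute pairs such as: gene_id "ENSG00000123456"
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def gtf_exons_to_bed(gtf_lines, biotype="protein_coding"):
    """Yield BED-style tuples for exon features of the given gene biotype.

    Hypothetical helper: the column order (chrom, start, end,
    transcript_id, gene_id, strand) is an assumption, not chanjo's spec.
    Pass biotype=None to keep every exon regardless of biotype.
    """
    for line in gtf_lines:
        # Skip blank lines and header/comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 columns; column 3 is the feature type
        if len(fields) != 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        # Ensembl calls it gene_biotype, GENCODE calls it gene_type
        gene_biotype = attrs.get("gene_biotype") or attrs.get("gene_type")
        if biotype is not None and gene_biotype != biotype:
            continue
        # GTF coordinates are 1-based inclusive; BED is 0-based half-open,
        # so only the start needs shifting
        start = int(fields[3]) - 1
        end = int(fields[4])
        yield (
            fields[0],
            start,
            end,
            attrs.get("transcript_id", ""),
            attrs.get("gene_id", ""),
            fields[6],
        )


if __name__ == "__main__":
    # Made-up record, for demonstration only
    sample = [
        '1\tsrc\texon\t11869\t12227\t.\t+\t.\t'
        'gene_id "G1"; transcript_id "T1"; gene_biotype "protein_coding";\n',
    ]
    for row in gtf_exons_to_bed(sample):
        print("\t".join(str(col) for col in row))
```

Files without a biotype attribute would yield nothing under the default filter, so `biotype=None` is the safer setting when the annotation source is unknown.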