gogetdata / ggd-recipes

conda recipes for genomic data
MIT License
85 stars 12 forks

Update GENCODE Canonical Transcript to remove duplicated genes caused… #182

Closed mikecormier closed 3 years ago

mikecormier commented 3 years ago

… by NMD and readthrough transcript annotations

GGD recipe review is required to merge a pull-request (PR). Once your PR is passing tests on CircleCI and is ready to be merged, add the `please review & merge` label to the PR. NOTE: If you are NOT a member of the gogetdata project (meaning that you can't add this label), add a comment requesting that the label be added.

* [x] I have read the [guidelines for ggd data recipes](https://gogetdata.github.io/contribute.html).
* [ ] This PR is for a new data recipe.
* [x] This data recipe **is directly relevant to the biological sciences**.
* [x] This PR updates an existing recipe.
* [ ] This PR does something else (explain below).

## I'm submitting a new recipe for species:

**Genome build for the recipe is:** GRCh38, GRCh37, hg38, hg19

## Description of the data and what processing is going on

GENCODE Canonical transcripts. Duplicate gene entries caused by GENCODE annotations such as `nonsense_mediated_decay` and `readthrough_transcript` are removed.

**File type(s):** gtf

**Data files contain a header:**