broadinstitute / gatk

Official code repository for GATK versions 4 and up
https://software.broadinstitute.org/gatk

Create tool for producing genomic regions (as a BED file) #7159

Open LeeTL1220 opened 3 years ago

LeeTL1220 commented 3 years ago

Feature request

Tool(s) or class(es) involved

This is a request for a new tool, GencodeRegionsAsBED.

Description

Given a GENCODE GTF, create a BED file containing the genomic region of each gene, one gene per row.

Suggestion: This can be implemented as a FeatureWalker<GencodeGtfFeature>
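To make the requested behavior concrete, here is a minimal standalone Python sketch. It is not the GATK FeatureWalker implementation suggested above, just an illustration of the mapping from GTF `gene` records to BED rows. The function name `gtf_genes_to_bed`, the `convert_start` parameter, and the example GTF lines (source, strand, and `gene_id` values) are hypothetical. One open point the issue does not settle: GTF coordinates are 1-based and inclusive while BED is conventionally 0-based and half-open, so the sketch subtracts 1 from the start by default and lets the caller turn that off.

```python
import re


def gtf_genes_to_bed(gtf_lines, convert_start=True):
    """Yield one tab-delimited BED row (chrom, start, end, gene name) per GTF 'gene' record.

    convert_start is a hypothetical switch: when True, the 1-based inclusive GTF
    start is shifted to the 0-based BED convention (start - 1).
    """
    for line in gtf_lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; column 3 is the feature type.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        chrom, start, end, attrs = fields[0], int(fields[3]), int(fields[4]), fields[8]
        match = re.search(r'gene_name "([^"]+)"', attrs)
        name = match.group(1) if match else "."
        bed_start = start - 1 if convert_start else start
        yield "\t".join([chrom, str(bed_start), str(end), name])


# Illustrative input only: source, strand, and gene_id values are placeholders.
example_gtf = [
    "##description: hypothetical excerpt\n",
    'chr22\tEXAMPLE\tgene\t21759657\t21867680\t.\t.\t.\tgene_id "GENE_ID_PLACEHOLDER"; gene_name "MAPK1";\n',
    'chr22\tEXAMPLE\ttranscript\t21759657\t21867645\t.\t.\t.\tgene_id "GENE_ID_PLACEHOLDER"; gene_name "MAPK1";\n',
]

bed_rows = list(gtf_genes_to_bed(example_gtf))
bed_rows_unshifted = list(gtf_genes_to_bed(example_gtf, convert_start=False))
```

Only the `gene` record produces a row; the `transcript` record is skipped. Whether the real tool should shift the start coordinate is a design decision for the implementer.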

Requirements

[P0] = "Must have. Cannot close this issue without this feature or without filing another issue. This tool is not considered complete without this feature."

[P2] = "Not required. This tool can be considered complete without this feature. No need to ask permission to drop it. If it is NOT delivered, please mention what P2's were not delivered in the closing comment of this issue."

Example output

BED is tab-delimited...

...
chr22   21759657    21867680    MAPK1
...

With transcript option:

...
chr22   21759657    21867645    MAPK1,ENST00000215832.11
chr22   21769040    21867680    MAPK1,ENST00000398822.7
chr22   21769204    21867440    MAPK1,ENST00000544786.1
...

Note: The union of the transcript regions is reported when the transcript option is not present.
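The note above can be checked against the example rows with a short Python sketch. It assumes "union" means the single enclosing interval (smallest start to largest end) of a gene's transcripts, which is consistent with the MAPK1 rows shown earlier. The function name `union_by_gene` and the tuple layout are hypothetical.

```python
def union_by_gene(transcript_rows):
    """Collapse per-transcript rows into one enclosing span per (chrom, gene).

    Each input row is (chrom, start, end, "GENE,TRANSCRIPT_ID").
    Assumption: the union is reported as min(start)..max(end).
    """
    spans = {}
    for chrom, start, end, name in transcript_rows:
        gene = name.split(",")[0]  # drop the transcript ID suffix
        key = (chrom, gene)
        if key in spans:
            prev_start, prev_end = spans[key]
            spans[key] = (min(prev_start, start), max(prev_end, end))
        else:
            spans[key] = (start, end)
    # Dicts preserve insertion order, so genes come out in first-seen order.
    return [(chrom, s, e, gene) for (chrom, gene), (s, e) in spans.items()]


# The three transcript rows from the example output above.
transcripts = [
    ("chr22", 21759657, 21867645, "MAPK1,ENST00000215832.11"),
    ("chr22", 21769040, 21867680, "MAPK1,ENST00000398822.7"),
    ("chr22", 21769204, 21867440, "MAPK1,ENST00000544786.1"),
]

gene_rows = union_by_gene(transcripts)
# gene_rows == [("chr22", 21759657, 21867680, "MAPK1")]
```

The result reproduces the gene-level row from the first example: the start comes from the first transcript and the end from the second.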

kockan commented 2 months ago

Resolved by #8942. Unless @LeeTL1220 or @droazen has any objections, I will close this with the note that all P0 requirements are met by that PR. The following P2 requirements are not included as of now, but if they are deemed important I can keep this open:

Thanks to @sanashah007 for all the work!