We need to create two versions of exome filter BED files (exon-only and gene-only), add them to our data dependencies, and include a script to generate these files in the util directory. This will provide more flexibility in filtering options for our pipeline users.
Proposed Changes
Create two BED files:
Exon-only filter
Gene-only filter (including both intron and exon)
Add these files to our data dependencies
Create a script to generate these files and add it to the util directory
Update documentation to reflect these new filtering options
Implementation Details
1. Create BED files
We need to create two BED files for each reference genome (hg19 and hg38):
exon_only_hg19.bed and exon_only_hg38.bed
gene_only_hg19.bed and gene_only_hg38.bed
These files should follow the standard BED format:
chromosome start end [name] [score] [strand]
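To make the column layout concrete, here is a minimal sketch of a parser for one BED line. It assumes tab-separated fields and relies on the BED convention that `start` is 0-based and `end` is exclusive. The names `BedRecord` and `parse_bed_line` are hypothetical and not part of the pipeline.

```python
from typing import NamedTuple, Optional


class BedRecord(NamedTuple):
    """One BED interval; the last three columns are optional."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: Optional[str] = None
    score: Optional[str] = None   # kept as text in this sketch
    strand: Optional[str] = None


def parse_bed_line(line: str) -> BedRecord:
    """Parse a tab-separated BED line with 3 to 6 columns (assumed layout)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got {len(fields)}")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start < 0 or end < start:
        raise ValueError(f"invalid interval {start}-{end} on {chrom}")
    # Pad the optional columns so missing ones become None.
    extra = fields[3:6]
    extra += [None] * (3 - len(extra))
    return BedRecord(chrom, start, end, *extra)
```

A check like this could run over the generated files before they are added to the data dependencies.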
2. Add files to data dependencies
Add the following files to the data dependencies:
data_dependencies/
├── ref_exonic_filter_bed/ # the directory might not be the right one, please check
│ ├── hg19/
│ │ ├── exon_only.bed
│ │ └── gene_only.bed
│ └── hg38/
│   ├── exon_only.bed
│   └── gene_only.bed
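As a rough sketch, the layout above could be checked with a small helper that lists the four expected files and reports any that are missing. The directory name `ref_exonic_filter_bed` is taken from the tree above and, as noted there, may not be the right one. The function names are hypothetical.

```python
from pathlib import Path

GENOMES = ("hg19", "hg38")
FILTERS = ("exon_only.bed", "gene_only.bed")


def expected_filter_paths(root):
    """Return the four BED paths expected under the data dependencies root."""
    base = Path(root) / "ref_exonic_filter_bed"  # assumed directory name
    return [base / genome / name for genome in GENOMES for name in FILTERS]


def missing_filter_files(root):
    """Return the expected paths that do not exist as regular files."""
    return [p for p in expected_filter_paths(root) if not p.is_file()]
```

For example, `missing_filter_files("data_dependencies")` returning an empty list would mean all four files are in place.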
Update the AWS S3 bucket with these new files:
3. Create generation script
Create a script named generate_exome_filters.py (or generate_exome_filters.R) in the util directory with the following structure:
4. Update documentation
Update the README and relevant documentation to include information about the new filtering options and how to use them in the pipeline.
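For the generation script in step 3, one possible shape is sketched below. It is not the intended structure from the issue, only an illustration under these assumptions: the input is a GTF annotation (for example from GENCODE or Ensembl) whose third column marks `exon` and `gene` features, overlapping intervals should be merged, and the output goes into the directory layout from step 2. All function names are hypothetical.

```python
from pathlib import Path


def gtf_intervals(lines, feature):
    """Yield (chrom, start, end) in BED coordinates for one GTF feature type."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature:
            continue
        # GTF is 1-based inclusive; BED is 0-based with an exclusive end.
        yield fields[0], int(fields[3]) - 1, int(fields[4])


def merge_intervals(intervals):
    """Sort intervals and merge those that overlap or touch on a chromosome.

    Chromosomes are sorted as plain strings, so chr10 comes before chr2.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


def write_bed(intervals, path):
    """Write 3-column BED records, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w") as handle:
        for chrom, start, end in intervals:
            handle.write(f"{chrom}\t{start}\t{end}\n")


def generate(gtf_path, genome, out_dir):
    """Write exon_only.bed and gene_only.bed for one genome build."""
    with open(gtf_path) as handle:
        lines = handle.readlines()
    for feature, name in (("exon", "exon_only.bed"), ("gene", "gene_only.bed")):
        intervals = merge_intervals(gtf_intervals(lines, feature))
        write_bed(intervals, Path(out_dir) / genome / name)
```

Gene records in a GTF span both introns and exons, which matches the gene-only filter described above. A command-line wrapper (for instance with `argparse`) could call `generate` once per reference genome.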
Tasks
Update --help and documentation to reflect the new filtering options
Additional Notes