LiuzLab / AI_MARRVEL

AI-MARRVEL (AIM) is an AI system for rare genetic disorder diagnosis
GNU General Public License v3.0

Add Exonic Filters in Proc.sh #42

Closed: jylee-bcm closed this 1 month ago

jylee-bcm commented 2 months ago

NOTE: This PR requires an update to the data dependencies directory.

NOTE: The Docker image must be rebuilt and redeployed to Docker Hub.

Description

We've received reports that uploading Whole Genome Sequence (WGS) data can take several weeks and may fail. To address this, we've implemented an Exonic Filter, developed by @hyunhwan-bcm, as detailed in this gist.

This filter is applied automatically when the number of input variants still exceeds 100,000 after mitochondrial filtering. For example, applying it to a WGS file containing approximately 4.8 million variants reduces the count to around 182,000, which is about 3.8% of the original (a reduction of roughly 96%).
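The threshold rule and the arithmetic behind the example figures can be sketched as follows. This is only an illustration in Python, not the code in Proc.sh: the function names, the set of mitochondrial contig names, and the strict "greater than" comparison are assumptions made for the sketch. Only the 100,000 threshold and the 4.8 million / 182,000 figures come from the description above.

```python
# Threshold stated in the PR description.
EXONIC_FILTER_THRESHOLD = 100_000

# Assumed mitochondrial contig names; the actual pipeline may use a different set.
MITO_CONTIGS = {"chrM", "chrMT", "M", "MT"}


def count_non_mito_variants(vcf_lines):
    """Count VCF records that are neither header lines nor on a mitochondrial contig."""
    count = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        chrom = line.split("\t", 1)[0]
        if chrom not in MITO_CONTIGS:
            count += 1
    return count


def should_apply_exonic_filter(n_variants, threshold=EXONIC_FILTER_THRESHOLD):
    """Return True when the variant count exceeds the threshold (assumed strict)."""
    return n_variants > threshold


def fraction_remaining(before, after):
    """Fraction of variants left after filtering."""
    return after / before


# Tiny hypothetical VCF used only to exercise the functions.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t12345\t.\tA\tG\t50\tPASS\t.",
    "chrM\t150\t.\tT\tC\t50\tPASS\t.",
    "chr2\t67890\t.\tC\tT\t50\tPASS\t.",
]

n = count_non_mito_variants(example_vcf)
print(n, should_apply_exonic_filter(n))

remaining = fraction_remaining(4_800_000, 182_000)
print(f"{remaining:.1%} remaining, {1 - remaining:.1%} removed")
```

With the figures quoted above, about 3.8% of the variants remain after filtering, so roughly 96.2% are removed.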

Updated Data Dependencies

The updated data dependencies are in the S3 bucket at s3://aim-data-dependencies-2.0/filter_exonic/ and can be listed with aws s3 ls s3://aim-data-dependencies-2.0/filter_exonic/.

hyunhwan-bcm commented 1 month ago

This will be merged with nextflow_conversion.