LieberInstitute / goesHyde_mdd_rnaseq

Fernando Goes and Thomas Hyde MDD RNA-seq project

Create code for matching and subsetting new cov_rse degradation data. #1

Closed lcolladotor closed 4 years ago

lcolladotor commented 4 years ago

Hi,

The goal here is to practice using the GitHub management tools and sgejobs, and to start editing some of our analysis scripts.

The task is to create a new R script that will create a new cov_rse object for MDD + BPD. The current object lives at data/degradation_rse_MDDseq_BiPSeq_BothRegions.Rdata and is used in several scripts, for example https://github.com/LieberInstitute/goesHyde_mdd_rnaseq/blob/6a722bbfea2041dbb46ac2a66b95c0b8268ba582/wgcna/run_wgcna_combined.R#L17.

It currently has the same metadata columns (colData()) as the filtered rse_gene object (the one with lowly-expressed genes removed), as tested in https://github.com/LieberInstitute/goesHyde_mdd_rnaseq/blob/6a722bbfea2041dbb46ac2a66b95c0b8268ba582/wgcna/run_wgcna_combined.R#L20. With the drop_flagged_samples.R script we are in the process of re-making the rse_gene objects: https://github.com/LieberInstitute/goesHyde_mdd_rnaseq/blob/d12acf804069463b829c8cf9ca3973976580a906/data/drop_flagged_samples.R#L115. We haven't finished that script yet: we still need to add the SNP genotype data (snpPC1, snpPC2, etc.) to it, finalize which samples we'll use, and then re-compute which genes (and other features) to filter out due to low expression. Once that is done, we will need a matching cov_rse object.

So the goal here is to create an R script with its companion bash script (use sgejobs to make this bash script) that will create a new cov_rse file that is subset to the same samples as the rse_gene object from https://github.com/LieberInstitute/goesHyde_mdd_rnaseq/blob/d12acf804069463b829c8cf9ca3973976580a906/data/drop_flagged_samples.R#L115 and has the same phenotype information.
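
As a rough illustration of the matching step, here is a minimal sketch on toy objects. It assumes that both objects are SummarizedExperiment-like and share sample IDs as their column names; the helper name match_cov_rse(), the toy sample IDs, and the toy colData() columns are hypothetical and not taken from the real data. Loading and saving the real .Rdata files is left out.

```r
library(SummarizedExperiment)

## Toy stand-in for the re-made rse_gene object (hypothetical values)
ids_gene <- c("R1", "R2", "R3")
rse_gene <- SummarizedExperiment(
    assays = list(counts = matrix(
        1:12,
        nrow = 4,
        dimnames = list(paste0("gene", 1:4), ids_gene)
    )),
    colData = DataFrame(
        PrimaryDx = c("MDD", "Control", "Bipolar"),
        snpPC1 = c(0.1, -0.2, 0.05),
        row.names = ids_gene
    )
)

## Toy stand-in for the current cov_rse: extra samples, different order,
## and outdated phenotype information
ids_cov <- c("R5", "R3", "R1", "R4", "R2")
cov_rse <- SummarizedExperiment(
    assays = list(counts = matrix(
        seq_len(10),
        nrow = 2,
        dimnames = list(c("region1", "region2"), ids_cov)
    )),
    colData = DataFrame(old_info = seq_along(ids_cov), row.names = ids_cov)
)

## Hypothetical helper: subset and reorder cov_rse to the samples in
## rse_gene, then copy over the phenotype table
match_cov_rse <- function(cov_rse, rse_gene) {
    m <- match(colnames(rse_gene), colnames(cov_rse))
    ## Every sample kept in rse_gene must exist in cov_rse
    stopifnot(!anyNA(m))
    cov_rse <- cov_rse[, m]
    colData(cov_rse) <- colData(rse_gene)
    stopifnot(identical(colnames(cov_rse), colnames(rse_gene)))
    cov_rse
}

cov_rse_new <- match_cov_rse(cov_rse, rse_gene)
```

Using match() rather than %in% keeps the samples in the same order as in rse_gene, which matters if downstream scripts compare the two colData() tables directly.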

Best, Leo

lcolladotor commented 4 years ago

In your commit messages, you can link commits automatically to this GitHub issue (and even close the issue) by using keywords such as "closes #1" or "fixes #1". Check https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue for more info.

lcolladotor commented 4 years ago

As a name & JHPCE location suggestion, use data/drop_flagged_samples_cov_rse.R and data/drop_flagged_samples_cov_rse.sh for these two scripts.
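
For the companion bash script, a call along these lines could be a starting point. This is only a sketch: the argument names are written from memory of the sgejobs documentation and should be checked against ?sgejobs::job_single, and the memory request and the Rscript command are placeholder assumptions.

```r
library(sgejobs)

## Preview the script contents without writing any files.
## memory and command are placeholder assumptions; adjust as needed.
script <- job_single(
    name = "drop_flagged_samples_cov_rse",
    create_shell = FALSE,
    create_logdir = FALSE,
    memory = "10G",
    command = "Rscript drop_flagged_samples_cov_rse.R"
)
```

Running it from the data/ directory with create_shell = TRUE should then write drop_flagged_samples_cov_rse.sh next to the R script.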