genetics-of-dna-methylation-consortium / godmc_phase2

This repository contains the code to run the analysis pipeline for phase 2 of goDMC released June 2024.
GNU General Public License v3.0

check the outputs for cellcount and age prediction #21

Closed SiyiSEA closed 1 month ago

SiyiSEA commented 2 months ago

This pull request is for:

  1. Extend the 03a check in check_upload.sh to handle the case where cellcounts_required="yes" and measured_cellcounts="NULL" are set in the config file;
  2. Check all the outputs for 03a.
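
As a rough illustration of the first point, the config handling could look something like the sketch below. This is not the code in check_upload.sh: the function name `cellcount_source_for_03a` and the returned labels are hypothetical, and only the two config variable names and their quoted values come from this pull request.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; not the actual check_upload.sh logic.
# Decides which kind of cell count output a 03a check would look for,
# given the two config values discussed in this pull request.
cellcount_source_for_03a() {
    local cellcounts_required="$1"   # "yes" or "no" (from the config file)
    local measured_cellcounts="$2"   # a file path, or "NULL" if none supplied

    if [ "$measured_cellcounts" != "NULL" ]; then
        # Measured cell counts were supplied by the user.
        echo "measured"
    elif [ "$cellcounts_required" = "yes" ]; then
        # Required but not measured: expect counts predicted from the data.
        echo "predicted"
    else
        # Not required and not measured: nothing to check.
        echo "none"
    fi
}

# Example call with the combination described above.
cellcount_source_for_03a "yes" "NULL"
```

The labels returned here are placeholders; a real check would map each case to the specific 03a output files it expects.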
SiyiSEA commented 2 months ago

If we do enforce cellcounts_required = "yes", would it be safer to delete the cellcounts_required parameter from the config file and update the wiki description of the cell count data? https://github.com/genetics-of-dna-methylation-consortium/godmc_phase2/wiki/Install-and-set-up#cell-count-data

I could do both before merging.

ejh243 commented 2 months ago

Currently it seems that recalculation of cell counts is not enforced, i.e. if cellcounts_required is set to no, this step is skipped. Should this be the case, @epzjlm?

ejh243 commented 2 months ago

Looking at the wiki, I wonder if this is because sorted datasets don't need cell composition calculated? But if this is the case, can't we use the sorted_methylation config variable to turn this off? Otherwise we might get whole blood datasets not running this.

epzjlm commented 2 months ago

It might be easier to add a check to script 01 or script 03.

ejh243 commented 1 month ago

I have removed the cellcounts_required variable and instead made better use of the sorted_methylation and measured_cellcounts variables. IMO, even for sorted datasets, the cell composition variables generated from the data are useful to have for QC. @SiyiSEA could you review my edits carefully in case they don't make sense?
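
To make the intent concrete, here is a minimal sketch of how the two remaining variables could drive the decision. It is an illustration under assumptions, not the edited pipeline code: the function name `cellcount_plan`, the plan labels, and the "yes"/"no" values for sorted_methylation are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; not the edited pipeline code.
# Cell composition is always estimated from the data for QC; the two config
# variables only change how the estimates would be used downstream.
cellcount_plan() {
    local sorted_methylation="$1"    # assumed values: "yes" or "no"
    local measured_cellcounts="$2"   # a file path, or "NULL" if none supplied

    # Estimation is not optional once cellcounts_required is removed.
    local plan="estimate_for_qc"

    if [ "$sorted_methylation" = "yes" ]; then
        # Sorted dataset: keep the estimates for QC only (assumed behaviour).
        plan="$plan,qc_only"
    elif [ "$measured_cellcounts" != "NULL" ]; then
        # Measured counts supplied: use them alongside the estimates (assumed).
        plan="$plan,use_measured"
    else
        # No measured counts: rely on the estimated counts (assumed).
        plan="$plan,use_estimated"
    fi
    echo "$plan"
}

# Example: whole blood dataset with no measured cell counts.
cellcount_plan "no" "NULL"
```

The point of the sketch is that no config combination skips the estimation step, so a whole blood dataset cannot opt out of it by accident.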

ejh243 commented 1 month ago

If both @SiyiSEA and @epzjlm are then happy we can merge this change and I will remove reference to the cellcounts_required variable from the wiki.