nolanlab / citrus

Citrus Development Code
GNU General Public License v3.0

Handling missing values in a citruskey #83

Open davemcilwain opened 8 years ago

davemcilwain commented 8 years ago

I have a nearly complete set of fcs files with baseline + serial samples from multiple individuals. The only problem is that I am missing some samples/fcs files from some later timepoints. I would like to be able to do one large CITRUS run with all files and all timepoints compared to baseline. However, I get an error when my 'citruskey' file has missing files in the matrix.

Is there a way to force the program to ignore missing files and process whatever is available for a given timepoint?

THANK YOU

rbruggner commented 8 years ago

Unfortunately, none of the models that Citrus uses to determine an association between cluster features and endpoints knows how to handle missing values.
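Because those models need complete data, one possible workaround (an assumption on my part, not a Citrus feature) is to split the big run into one baseline-vs-timepoint comparison per timepoint, keeping only the individuals who have both files for that comparison. A minimal Python sketch of that bookkeeping follows; the `key` layout, the file names, and the helper names are all hypothetical and do not reflect the actual citruskey format.

```python
# Hypothetical key: one entry per individual, one file per timepoint.
# None marks a missing FCS file. All names here are made up for illustration.
key = {
    "donor1": {"baseline": "d1_t0.fcs", "week1": "d1_t1.fcs", "week4": "d1_t4.fcs"},
    "donor2": {"baseline": "d2_t0.fcs", "week1": None, "week4": "d2_t4.fcs"},
    "donor3": {"baseline": "d3_t0.fcs", "week1": "d3_t1.fcs", "week4": None},
}


def complete_pairs(key, baseline, timepoint):
    """Return {individual: (baseline_file, timepoint_file)} for individuals
    that have BOTH files; individuals missing either file are skipped."""
    pairs = {}
    for individual, files in key.items():
        base = files.get(baseline)
        later = files.get(timepoint)
        if base is not None and later is not None:
            pairs[individual] = (base, later)
    return pairs


def per_timepoint_keys(key, baseline):
    """Build one complete-case key per non-baseline timepoint."""
    timepoints = sorted(
        {tp for files in key.values() for tp in files if tp != baseline}
    )
    return {tp: complete_pairs(key, baseline, tp) for tp in timepoints}


keys = per_timepoint_keys(key, "baseline")
for timepoint, pairs in keys.items():
    print(timepoint, sorted(pairs))
```

Each per-timepoint key could then be run as its own analysis. The trade-off is that each comparison uses a different subset of individuals, so results across timepoints are not directly comparable.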

If you are just interested in clustering those files together (with no endpoint association), you can achieve that using the Citrus R code directly.

If you still want to perform associations with the endpoint and include the conditions with missing values, I think you'll have to build the features yourself and choose another model (not built into Citrus) that supports missing values (like a regression tree or something similar).
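To illustrate what "supports missing values" can mean for a tree-style model, here is a minimal sketch of a single-split regression tree (a stump) in plain Python. It is not Citrus code and not any particular library's algorithm. It assumes you have already built a feature table yourself, with `float("nan")` wherever a sample's file was missing, and it handles missing values by trying both directions at each candidate split and keeping the one with lower squared error. The feature values and endpoint values below are invented.

```python
import math


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(rows, targets):
    """Fit a one-split regression tree. rows may contain NaN for missing
    features; each split also learns which side missing values go to."""
    best = None
    for j in range(len(rows[0])):
        observed = sorted({r[j] for r in rows if not math.isnan(r[j])})
        # Candidate thresholds: midpoints between consecutive observed values.
        for lo, hi in zip(observed, observed[1:]):
            threshold = (lo + hi) / 2
            for missing_left in (True, False):
                left, right = [], []
                for r, y in zip(rows, targets):
                    x = r[j]
                    go_left = missing_left if math.isnan(x) else x <= threshold
                    (left if go_left else right).append(y)
                error = sse(left) + sse(right)
                if best is None or error < best["error"]:
                    best = {
                        "feature": j,
                        "threshold": threshold,
                        "missing_left": missing_left,
                        "error": error,
                        "left_mean": sum(left) / len(left),
                        "right_mean": sum(right) / len(right),
                    }
    return best


def predict(stump, row):
    """Predict the endpoint for one row, routing NaN in the learned direction."""
    x = row[stump["feature"]]
    go_left = stump["missing_left"] if math.isnan(x) else x <= stump["threshold"]
    return stump["left_mean"] if go_left else stump["right_mean"]


nan = float("nan")
# Hypothetical per-sample cluster features (two features) and endpoint values.
rows = [[0.1, nan], [0.2, 1.0], [nan, 5.0], [0.8, 6.0], [0.9, nan]]
targets = [1.0, 1.0, 3.0, 3.0, 3.0]

stump = fit_stump(rows, targets)
print(stump["feature"], stump["threshold"], stump["missing_left"])
```

A real analysis would use a full tree implementation with cross-validation rather than a single stump. The point is only that the model itself decides what to do with a missing feature, so incomplete samples do not have to be dropped.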