Closed ShanSabri closed 4 years ago
Could you try adding one more level for the DH cluster model (e.g., "DHCluterNum3 5000") along with the required files (e.g., "DHClusterCoef3 DH_cluster5000_coef.txt" and "DHClusterPredictor3 DH_cluster5000_predictor.txt")? The current build library function requires three levels of DH clusters.
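For anyone hitting the same hang, a quick pre-flight check of the parameter file can show whether all three cluster levels are present before running the build. The following is only a rough sketch, not part of BIRD: it assumes the parameter file uses whitespace-separated "key value" lines as in the examples above, and that levels 1 and 2 follow the same naming pattern as the level 3 keys ("DHClusterCoef1", "DHClusterPredictor1", and so on). The function names are hypothetical.

```python
import os

# Assumption: three levels are required, per the maintainer's comment above.
REQUIRED_LEVELS = (1, 2, 3)
# Assumption: levels 1 and 2 use the same key prefixes quoted for level 3.
KEY_PREFIXES = ("DHClusterCoef", "DHClusterPredictor")


def parse_par_file(text):
    """Parse whitespace-separated 'key value' lines into a dict (assumed format)."""
    params = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            params[parts[0]] = parts[1].strip()
    return params


def missing_cluster_entries(params, check_files=False):
    """List coef/predictor keys that are unset or, optionally, point to missing files."""
    problems = []
    for level in REQUIRED_LEVELS:
        for prefix in KEY_PREFIXES:
            key = f"{prefix}{level}"
            path = params.get(key)
            if path is None:
                problems.append(f"{key}: not set")
            elif check_files and not os.path.isfile(path):
                problems.append(f"{key}: file not found ({path})")
    return problems
```

To use it, read the parameter file into a string, pass it to `parse_par_file`, and print whatever `missing_cluster_entries(params, check_files=True)` returns. An empty list means all six coef and predictor entries are set and their files exist.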
Yes, the model was built properly using 3 cluster levels. I'm curious to know why three levels of cluster assignments are required. How exactly does BIRD use this information?
BIRD is designed to combine DH cluster-level prediction with locus-level prediction, based on our observation that DH cluster activities are easy to predict. If you would like to know the details of how BIRD combines these predictions, see the Methods section "Model aggregation" in our first BIRD paper: Zhou W, Sherwood B, Ji Z, Xue Y, Du F, Bai J, Ying M, Ji H. Genome-wide Prediction of DNase I Hypersensitivity Using Gene Expression. Nature Communications 8, 1038 (2017). The three-level clustering assignment is an initial setting in the BIRD model that represents the typical levels of clusters for the DH data. It is hard-coded in the current version of the build model function. I will update the code to make this more flexible in a future version. Thanks for pointing this out.
Great, thank you for the clarification.
par_file.txt (all files exist)
Building library
There seems to be a hangup when trying to read DH_cluster1000_predictor.txt. This file does exist at the file path given in the parameter file. The head of this file looks like:
Any thoughts?
EDIT: Digging into the source, it seems as though BIRD is expecting 3 coef files and their corresponding predictor files. Is there a particular reason why?