sambofra / bnstruct

R package for Bayesian Network Structure Learning
GNU General Public License v3.0

Static Variables #19

AnaR-Martins closed 4 years ago

AnaR-Martins commented 4 years ago

Hi,

I would like to ask if there is any way to handle static variables with bnstruct, without having to repeat them in each time slice. For example, for a clinical trial, I want to consider as a variable the sex of each patient.

Thanks in advance!

albertofranzin commented 4 years ago

Hi,

there is no built-in way of including them only once and, at the same time, considering them static throughout the various layers. You have to manually define all the time steps, and include these static variables in the first layer (which will have more variables than the following ones).

AnaR-Martins commented 4 years ago

So when I learn the network I shouldn't use the function learn.dynamic.network but instead I should use the function learn.network and define the layers manually, is that it?

albertofranzin commented 4 years ago

Yes, exactly.
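As a rough illustration of the manual setup described above, the sketch below builds the variable names and a layering vector for one static variable followed by three time slices. The variable names (`sex`, `x`, `y`) and the number of time steps are hypothetical, and placing the static variable in a layer of its own ahead of the time slices is only one possible arrangement. The `learn.network` call is left commented out because it needs a `BNDataset` built from real data.

```r
# Hypothetical variables: one static, two measured at each time step.
static_vars <- c("sex")
slice_vars  <- c("x", "y")
n_steps     <- 3

# Column names: static variables first, then one block per time step (t0, t1, ...).
variables <- c(
  static_vars,
  paste0(rep(slice_vars, times = n_steps), "_t",
         rep(seq_len(n_steps) - 1, each = length(slice_vars)))
)

# Layer 1 holds the static variables; time step k goes to layer k + 1.
layering <- c(
  rep(1, length(static_vars)),
  rep(seq_len(n_steps) + 1, each = length(slice_vars))
)

print(variables)
print(layering)

# With a BNDataset `dataset` whose columns follow `variables`:
# net <- learn.network(dataset, layering = layering)
```

Building the vector from counts rather than typing it by hand keeps the layering aligned with the column order when variables or time steps are added.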

AnaR-Martins commented 4 years ago

Okay, thank you! I was trying to do that; however, every time I try to run the code, the R session aborts, saying that R encountered a fatal error. Do you have any idea what it could be? The code I am trying to run is the following:

datasetTSimples1sI <- BNDataset(data=TSimples1sI, discreteness=c("d","c","d","c","c","c","d","c","c","c","d","c","c","c"), variables=c("HLAB28","IMC","eva_doente_t0","pcr_t0","basfi1_t0","basdai_q1_t0","eva_doente_t1","pcr_t1","basfi1_t1","basdai_q1_t1","eva_doente_t2","pcr_t2","basfi1_t2","basdai_q1_t2"), node.sizes=c(2,4,101,10,10,10,101,10,10,10,101,10,10,10), starts.from=0)

which works, but produces the following warning messages:

1: In validityMethod(object) : Not all of the possible values have been observed for variables 3 7 11
2: In validityMethod(object) : Not all of the possible values have been observed for variables 3 7 11
3: In validityMethod(object) : Not all of the possible values have been observed for variables 3 7 11
4: In validityMethod(object) : Not all of the possible values have been observed for variables 3 7 11

The previous code is followed by:

layers_TSimples1sI <- c(1,1,2,2,2,2,3,3,3,3,4,4,4,4)

and

dbn_Testes.Simples1sI <- learn.network(datasetTSimples1sI,layering=layers_TSimples1sI)

which is the line of code that terminates the R session.

Thank you in advance

albertofranzin commented 4 years ago

I guess it happens because you have too many possible values for each variable, and the R session is trying to allocate too much memory for the conditional probability tables. You have variables that take 101 values: as a rough upper bound, even with the layering, the algorithm might try to allocate a block of memory of size 101 * (101 * 10 * 10 * 10) ^ 2 * 2 * 4 * sizeof(int), which is on the order of terabytes of RAM.

Imposing a limit on the number of parent nodes is probably not going to help much; you have to reduce the number of values for each variable.

The warnings, instead, have no consequence at all.
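The rough upper bound quoted in this reply can be checked with a few lines of arithmetic. The sketch below simply evaluates the expression as written, assuming `sizeof(int)` is 4 bytes; the function name `cpt_bound_bytes` and its parameters are made up for illustration and are not part of bnstruct.

```r
# Evaluates: n * (n * 10 * 10 * 10)^2 * 2 * 4 * sizeof(int), in bytes,
# where n is the number of values of the large discrete variable.
# Assumption: sizeof(int) is 4 bytes.
cpt_bound_bytes <- function(n_values, other_sizes = c(10, 10, 10), int_bytes = 4) {
  slice_configs <- n_values * prod(other_sizes)  # the bracketed term in the reply
  n_values * slice_configs^2 * 2 * 4 * int_bytes # 2 * 4: factors as written in the reply
}

bound_101 <- cpt_bound_bytes(101)
bound_11  <- cpt_bound_bytes(11)

cat(sprintf("101 values: %.1f TB\n", bound_101 / 1e12))
cat(sprintf(" 11 values: %.1f GB\n", bound_11 / 1e9))
cat(sprintf("reduction factor: %.0f\n", bound_101 / bound_11))
```

Under this estimate the bound grows with the cube of the number of values, which is why shrinking the 101-value variables has such a large effect. It is only a loose worst case, so the memory actually requested may be far smaller.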

AnaR-Martins commented 4 years ago

I changed the 101 to 11 and it worked! Thank you!!
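The thread does not show how the data were recoded to fit 11 values. One possible way to bring an integer score on a 0 to 100 scale down to 11 levels, sketched here with made-up data, is integer division by 10; the vector `eva` is hypothetical.

```r
# Hypothetical scores on a 0..100 integer scale.
eva <- c(0, 9, 10, 55, 99, 100)

# Integer division by 10 maps 0..100 onto 0..10, i.e. 11 possible values.
eva_binned <- eva %/% 10

print(eva_binned)
print(length(unique((0:100) %/% 10)))
```

With a recoding like this, the matching entries in `node.sizes` would presumably become 11, and the values would still start from 0.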