sambofra / bnstruct

R package for Bayesian Network Structure Learning
GNU General Public License v3.0

Unable to complete asia 2 layers example #11

Closed: caseyzipfel closed this issue 5 years ago

caseyzipfel commented 5 years ago

Hello! I would like to use the bnstruct package to learn the network structure of a Dynamic Bayesian Network from time series data. I have been trying to understand and replicate the examples from https://cran.r-project.org/web/packages/bnstruct/vignettes/bnstruct.pdf. I am unable to get the example under "Learning Dynamic Bayesian Networks", which uses the asia 2 layers data, to work. I downloaded the data from this GitHub page and attempted to run the code:

dataset <- BNDataset("asia_2_layers.data", "asia_2_layers.header", num.time.steps = 2)

I receive the following error:

Error in read.dataset(dataset, data, discreteness, ...) : Incoherent number of variables in the dataset header.
In addition: Warning messages:
1: In Ops.factor(left, right) : ‘+’ not meaningful for factors
2: In Ops.factor(left, right) : ‘+’ not meaningful for factors
3: In Ops.factor(left, right) : ‘+’ not meaningful for factors

Could you please advise me on this issue? Thank you!

albertofranzin commented 5 years ago

Hi Casey,

there is an error in that example in the vignette: the values in those files start from 0 (bnstruct assumes they start from 1 by default), so the command to use is

dataset <- BNDataset("path/to/asia_2_layers.data", "path/to/asia_2_layers.header", num.time.steps = 2, starts.from = 0)

(I don't think I can submit an update to CRAN just for this.)
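To check which starting value a data file uses before calling BNDataset, a quick look at the smallest entry is usually enough. The following is only a minimal sketch in base R: the helper name guess_starts_from is hypothetical (it is not part of bnstruct), and it assumes the .data file is whitespace-separated, numeric, and has no header row. The toy file stands in for a real data file.

```r
# Hypothetical helper (not part of bnstruct): report the smallest value in a
# data file, which is the value to pass as starts.from.
# Assumption: whitespace-separated numeric file with no header row.
guess_starts_from <- function(data_file) {
  raw <- read.table(data_file, header = FALSE)
  min(as.matrix(raw), na.rm = TRUE)
}

# Toy file standing in for a real .data file.
toy_file <- tempfile(fileext = ".data")
writeLines(c("0 1 1 0", "1 0 0 1", "0 0 1 1"), toy_file)

start <- guess_starts_from(toy_file)
print(start)
```

If the reported minimum is 0, passing starts.from = 0 to BNDataset, as in the command above, should match the file.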

caseyzipfel commented 5 years ago

Thank you! I got it working. I have a second question, though. When running learn.dynamic.network, I get the error:

Error in cut.default(data[, i], quantiles, labels = FALSE, include.lowest = TRUE) : invalid number of intervals

What data goes into data[, i] in the cut function? Is it cutting the different time points of each variable? Or all of the data? Or a set of observations? Any help would be much appreciated! Thank you!

albertofranzin commented 5 years ago

That's an error that can show up when discretizing a continuous variable X. It means that the range of values observed for X is being split in a way that makes it impossible for cut to find meaningful cutting points (e.g. the same cutting point occurs more than once, resulting in an empty interval).

Every variable (column) is treated independently.

There was a bug that should be solved now. Try checking your data to see whether there are "strange things", for example a variable that always takes the same value (as was the case in this issue).
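One way to look for such "strange things" is to scan each column on its own before learning the network. This is a rough sketch in base R, not bnstruct's internal discretization code: the helper name check_discretizable and the n_bins parameter are hypothetical, and quantile-based binning is assumed only because the error message mentions quantiles. It flags columns that are constant or whose quantile cutting points coincide.

```r
# Hypothetical helper (not part of bnstruct): flag columns that are likely to
# break quantile-based discretization. Each column is checked independently.
check_discretizable <- function(data, n_bins = 3) {
  data <- as.data.frame(data)
  stopifnot(all(vapply(data, is.numeric, logical(1))))
  probs <- seq(0, 1, length.out = n_bins + 1)
  report <- data.frame(
    variable = names(data),
    n_distinct = NA_integer_,
    n_unique_breaks = NA_integer_,
    stringsAsFactors = FALSE
  )
  for (i in seq_along(data)) {
    x <- data[[i]]
    # Number of distinct observed (non-missing) values in this column.
    report$n_distinct[i] <- length(unique(x[!is.na(x)]))
    # Candidate cutting points; duplicates mean some interval would be empty.
    breaks <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
    report$n_unique_breaks[i] <- length(unique(breaks))
  }
  report$suspicious <- report$n_distinct < 2 |
    report$n_unique_breaks < n_bins + 1
  report
}

# Toy data: one well-spread column and one constant column.
toy <- data.frame(
  ok = c(0.1, 0.5, 0.9, 1.3, 1.7, 2.1),
  constant = rep(1, 6)
)
result <- check_discretizable(toy, n_bins = 3)
print(result)
```

Columns marked as suspicious are the ones worth inspecting by hand, or dropping, before running learn.dynamic.network.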

If the data is OK and you still have this issue, please give me more details: which commands you are using, some sample data (if possible), etc.