mhahsler / pomdp

R package for Partially Observable Markov Decision Processes

Columns and rows of the outputs need to be reordered in function transition_matrix #5

Closed blake95 closed 4 years ago

blake95 commented 4 years ago

First of all, thank you for this amazing package!

When I extract a transition probability matrix from a specified model with the transition_matrix function, the row and column names appear in the same order as the pre-specified state names, but the values do not follow that order. I think you need to add [states, states] after spreading the transition probability data frame so that the elements of the transition probability matrix are reordered to match.
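The reordering idea can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's code: the state names, the toy data frame `tp`, and the use of `xtabs()` in place of the package's spreading step are all hypothetical.

```r
# Hypothetical state order, as a user might declare it in a model.
states <- c("healthy", "sick", "dead")

# Toy long-format transition table (one row per non-zero probability).
tp <- data.frame(
  start.state = c("healthy", "healthy", "sick", "sick", "dead"),
  end.state   = c("healthy", "sick",    "sick", "dead", "dead"),
  probability = c(0.9, 0.1, 0.7, 0.3, 1.0),
  stringsAsFactors = FALSE
)

# Going from long to wide sorts the labels alphabetically,
# so the wide table is ordered dead, healthy, sick.
wide <- xtabs(probability ~ start.state + end.state, data = tp)

# Indexing by name with [states, states] restores the declared order
# and moves the values together with their labels.
tm <- unclass(wide)[states, states]
tm
```

Indexing by character names moves rows and columns as a unit, so labels and values cannot drift apart, which is why the `[states, states]` step is safe to apply even when the order already matches.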

Thanks again!

mhahsler commented 4 years ago

Hi,

please send us an example that shows the problem so we can find the bug.

Thanks, Michael

blake95 commented 4 years ago

Sure thing! I uploaded a pomdp file. You can compare the original transition probability dataframe in the file and the transition matrix extracted by transition_matrix. Thanks!

mhahsler commented 4 years ago

Please upload the R code that creates the POMDP model. read_POMDP() does not parse the transition matrix in the file, so I cannot use transition_matrix to check for problems.

blake95 commented 4 years ago

Of course. I also created 2 correct matrices, observation_mx and transition_mx for your reference. The reward matrix extracted by reward_matrix looks good for my case. Thank you!

mhahsler commented 4 years ago

I think your code to create the matrices might be wrong. I added the following to your code:

r0 <- matrix(0, nrow = length(all_state), ncol = length(all_state), 
  dimnames = list(all_state, all_state))
for(i in seq_len(nrow(transition_prob))) {
  if(transition_prob[i, "action"] == "r0")
    r0[transition_prob[i, "start.state"], transition_prob[i, "end.state"]] <- transition_prob[i, "probability"]
}

identical(r0, transition_matrix(m)$r0)

Result:

[1] TRUE

Please check.

blake95 commented 4 years ago

Hi, your code is absolutely correct, but I think you need to convert the columns "start.state" and "end.state" back to characters first:

transition_prob$start.state <- as.character(transition_prob$start.state)
transition_prob$end.state <- as.character(transition_prob$end.state)
r0 <- matrix(0, nrow = length(all_state), ncol = length(all_state), 
  dimnames = list(all_state, all_state))
for(i in seq_len(nrow(transition_prob))) {
  if(transition_prob[i, "action"] == "r0")
    r0[transition_prob[i, "start.state"], transition_prob[i, "end.state"]] <- transition_prob[i, "probability"]
}

identical(r0, transition_matrix(m)$r0)

This returns FALSE.

I think the helper function T_ converted those two columns into factors, which caused the mismatch. Thanks!
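A minimal sketch of why factor columns can cause this kind of mismatch: R indexes a matrix with a factor by its integer codes, not by its labels. The two-state example, the data frame `tp`, and the helper `fill_matrix()` below are hypothetical and only mimic the loop from this thread.

```r
# Hypothetical declared state order (deliberately not alphabetical).
all_state <- c("up", "down")

# Toy transition table whose state columns are factors.
tp <- data.frame(
  start.state = c("up", "down"),
  end.state   = c("up", "up"),
  probability = c(0.8, 0.4),
  stringsAsFactors = TRUE
)

# Fill a matrix row by row, optionally converting the labels to character.
fill_matrix <- function(tp, as_char) {
  m <- matrix(0, nrow = 2, ncol = 2, dimnames = list(all_state, all_state))
  for (i in seq_len(nrow(tp))) {
    from <- tp[i, "start.state"]
    to   <- tp[i, "end.state"]
    if (as_char) {
      from <- as.character(from)
      to   <- as.character(to)
    }
    # A factor index is used as its integer code; a character index
    # is matched against the dimnames.
    m[from, to] <- tp[i, "probability"]
  }
  m
}

by_code <- fill_matrix(tp, as_char = FALSE)  # positions from factor codes
by_name <- fill_matrix(tp, as_char = TRUE)   # positions from state names

identical(by_code, by_name)
```

Both matrices carry the same dimnames, but the values land in different cells, because the factor levels are sorted alphabetically ("down", "up") while `all_state` is not.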

mhahsler commented 4 years ago

I think you are right. I am currently updating the package for R 4.0, which no longer automatically converts character vectors to factors when creating data.frames. I think this issue is already resolved in the pomdp version on GitHub.
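For code that has to behave the same on R 3.x and R 4.x, one option is to set stringsAsFactors explicitly and to convert any remaining factor columns before using them as indices. The sketch below is an assumption about how that could look, not what the package does; `factors_to_character()` is a hypothetical helper.

```r
# Hypothetical state names; the column name mirrors the thread.
states <- c("up", "down")

# Setting the argument explicitly removes the dependence on the R version.
tp_chr <- data.frame(start.state = states, stringsAsFactors = FALSE)
tp_fct <- data.frame(start.state = states, stringsAsFactors = TRUE)

class(tp_chr$start.state)
class(tp_fct$start.state)

# Hypothetical guard: convert every factor column back to character.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(col) {
    if (is.factor(col)) as.character(col) else col
  })
  df
}

tp_safe <- factors_to_character(tp_fct)
class(tp_safe$start.state)
```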