mhahsler / pomdp

R package for Partially Observable Markov Decision Processes

Issue after updating to 1.0.0-1 #16

Closed emile-pelletier-gc closed 2 years ago

emile-pelletier-gc commented 2 years ago

Hello Michael,

I just updated from pomdp 0.99.3 to the new 1.0.0-1, and an R Shiny dashboard tool that I made now throws this error:

Output created: C:/Users/Emile/AppData/Local/Temp/RtmpKs7Zol/file12614629879f5/xxxx.html
Warning: Error in sprintf: invalid format '%.7f'; use format %s for character objects
  3:
  1: rmarkdown::run

Not sure if there is anything that can be suggested. Is this an issue that can be pursued?

Thanks so much.

mhahsler commented 2 years ago

Hi, I rewrote some of the POMDP file handling code, and it seems like you have a special case where something is specified as a vector of strings instead of numbers (I guess the initial belief state). Can you share the POMDP definition/file with me so I can debug and fix the issue?
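
For reference, the message in the report is what base R's `sprintf()` raises when a numeric format such as `%.7f` is applied to a character vector. The following is a minimal sketch of that behaviour only; it does not reproduce the pomdp package's internal code, and the variable names (`belief_num`, `belief_chr`) are made up for illustration.

```r
# Illustration only: base R behaviour, not the pomdp package's actual code.
belief_num <- c(0.25, 0.25, 0.25, 0.25, 0)
belief_chr <- as.character(belief_num)  # same values, stored as strings

# Numeric input is formatted without complaint.
formatted <- sprintf("%.7f", belief_num)

# Character input raises the error seen in the report; capture its message.
msg <- tryCatch(
  sprintf("%.7f", belief_chr),
  error = function(e) conditionMessage(e)
)
print(msg)

# Converting back to numeric before formatting avoids the error.
fixed <- sprintf("%.7f", as.numeric(belief_chr))
stopifnot(identical(formatted, fixed))
```

Checking `is.numeric()` on each probability vector before building the model is one way to narrow down which argument ends up as strings.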

emile-pelletier-gc commented 2 years ago

Hi Michael,

Yes, I found the source of the problem. The POMDP defined below works fine with the old 0.99.3.

But it does not work with 1.0.0-1.


require(pomdp)

falsepos = 0.025

states = c("sm1", "sm2", "sm3", "sm4", "goal") 

initialstate = c(1/4, 1/4, 1/4, 1/4)

actions = c("m_sm1_sm2", "m_sm2_sm1", "m_sm2_sm3", 
            "m_sm3_sm2", "m_sm3_sm4", "m_sm4_sm3", "tag")

obsfunction <- NULL

obsfunction <- rbind(obsfunction, O_(action = actions[1], end.state = states[2], observation = "no", probability = falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[1], end.state = states[2], observation = "objective_met", probability = 1 - falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[2], end.state = states[1], observation = "no", probability = falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[2], end.state = states[1], observation = "objective_met", probability = 1 - falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[3], end.state = states[3], observation = "no", probability = falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[3], end.state = states[3], observation = "objective_met", probability = 1 - falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[4], end.state = states[2], observation = "no", probability = falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[4], end.state = states[2], observation = "objective_met", probability = 1 - falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[5], end.state = states[4], observation = "no", probability = falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[5], end.state = states[4], observation = "objective_met", probability = 1 - falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[6], end.state = states[3], observation = "no", probability = falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[6], end.state = states[3], observation = "objective_met", probability = 1 - falsepos))

obsfunction <- rbind(obsfunction, O_(action = actions[1], end.state = states[1], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[1], end.state = states[3], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[1], end.state = states[4], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[2], end.state = states[2], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[2], end.state = states[3], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[2], end.state = states[4], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[3], end.state = states[1], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[3], end.state = states[2], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[3], end.state = states[4], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[4], end.state = states[1], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[4], end.state = states[3], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[4], end.state = states[4], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[5], end.state = states[1], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[5], end.state = states[2], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[5], end.state = states[3], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[6], end.state = states[1], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[6], end.state = states[2], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[6], end.state = states[4], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[1], end.state = states[1], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[1], end.state = states[3], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[1], end.state = states[4], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[2], end.state = states[2], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[2], end.state = states[3], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[2], end.state = states[4], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[7], end.state = states[5], observation = "objective_met", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = actions[7], end.state = states[1], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[7], end.state = states[2], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[7], end.state = states[3], observation = "no", probability = 1))
obsfunction <- rbind(obsfunction, O_(action = actions[7], end.state = states[4], observation = "no", probability = 1))

obsfunction <- rbind(obsfunction, O_(action = "*", end.state = states[5], observation = "objective_met", probability = 1))

transitionfunction <- NULL

transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[1], end.state = states[2], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[2], end.state = states[1], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[2], end.state = states[3], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[3], end.state = states[2], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[3], end.state = states[4], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[4], end.state = states[3], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[4], end.state = states[5], probability = 1))

transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[5], end.state = states[2], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[5], end.state = states[1], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[5], end.state = states[3], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[5], end.state = states[2], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[5], end.state = states[4], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[5], end.state = states[3], probability = 1))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[5], end.state = states[5], probability = 1))

transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[3], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[3], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[3], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[3], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[2], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[2], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[2], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[2], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[1], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[1], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[1], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[7], start.state = states[1], end.state = states[4], probability = 0.25))

transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[2], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[2], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[2], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[2], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[3], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[3], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[3], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[3], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[4], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[4], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[4], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[1], start.state = states[4], end.state = states[4], probability = 0.25))

transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[1], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[1], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[1], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[1], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[3], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[3], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[3], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[3], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[4], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[4], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[4], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[2], start.state = states[4], end.state = states[4], probability = 0.25))

transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[1], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[1], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[1], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[1], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[3], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[3], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[3], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[3], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[4], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[4], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[4], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[3], start.state = states[4], end.state = states[4], probability = 0.25))

transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[1], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[1], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[1], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[1], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[2], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[2], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[2], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[2], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[4], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[4], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[4], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[4], start.state = states[4], end.state = states[4], probability = 0.25))

transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[1], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[1], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[1], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[1], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[2], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[2], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[2], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[2], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[4], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[4], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[4], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[5], start.state = states[4], end.state = states[4], probability = 0.25))

transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[1], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[1], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[1], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[1], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[2], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[2], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[2], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[2], end.state = states[4], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[3], end.state = states[1], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[3], end.state = states[2], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[3], end.state = states[3], probability = 0.25))
transitionfunction <- rbind(transitionfunction, T_(action = actions[6], start.state = states[3], end.state = states[4], probability = 0.25))

rwdfunction <- NULL

rwdfunction <- rbind(rwdfunction, R_(action = actions[7], start.state = states[4], "*", "*", v = 5))

rwdfunction <- rbind(rwdfunction, R_(action = actions[7], start.state = states[5], "*", "*", v = 0))

rwdfunction <- rbind(rwdfunction, R_(action = actions[7], start.state = states[1], "*", "*", v = -1))

rwdfunction <- rbind(rwdfunction, R_(action = actions[7], start.state = states[2], "*", "*", v = -1))

rwdfunction <- rbind(rwdfunction, R_(action = actions[7], start.state = states[3], "*", "*", v = -1))

rwdfunction <- rbind(rwdfunction, R_(action = actions[1], start.state = "*", "*", "*", v = -1))

rwdfunction <- rbind(rwdfunction, R_(action = actions[2], start.state = "*", "*", "*", v = -1))

rwdfunction <- rbind(rwdfunction, R_(action = actions[3], start.state = "*", "*", "*", v = -1))

rwdfunction <- rbind(rwdfunction, R_(action = actions[4], start.state = "*", "*", "*", v = -1))

rwdfunction <- rbind(rwdfunction, R_(action = actions[5], start.state = "*", "*", "*", v = -1))

rwdfunction <- rbind(rwdfunction, R_(action = actions[6], start.state = "*", "*", "*", v = -1))

DefinedPOMDP_dfs <- POMDP(

  name = "POMDP defined with df",

  discount = 0.95,

  states = c("sm1", "sm2", "sm3", "sm4", "goal"), 

  start = c(initialstate, 0),

  actions = actions,

  observations = c("objective_met", "no"), 

  transition_prob = transitionfunction,

  observation_prob = obsfunction,

  reward = rwdfunction
)

mhahsler commented 2 years ago

Thank you! I have isolated the bug and will fix it shortly.

mhahsler commented 2 years ago

The fix is now on GitHub and also on its way to CRAN.