pettni / pdf-abstraction

Now developed in the mdp_network repository

Class for graph-based backups #9

Closed shaesaert closed 6 years ago

shaesaert commented 6 years ago

Updates

  1. Created a class that includes the backup operations and the product of the specification DFA and the roadmap.
  2. Allowed inputs both as formal specifications (type: string) and as DFAs.
  3. Changed the order of the backups #5
  4. Added convergence criteria to the backups #5
  5. Added a discount (i.e., probability loss due to observations).
  6. Added a way to plot the 1-D value functions at a given node.

A Jupyter example file is given at https://github.com/pettni/pdf-abstraction/blob/ScalableBest_class_change/Demo_file/Rocksample.ipynb in the latest commit https://github.com/pettni/pdf-abstraction/commit/b232d56ee88793c1233568f9a2196240bfdc54a1. This example is now more substantial than the one presented at last week's meeting.

1. Created class

The class builds on networkx and can therefore easily accommodate structured backup schemes. See also point 3.

The class can be found at: https://github.com/pettni/pdf-abstraction/blob/b232d56ee88793c1233568f9a2196240bfdc54a1/best/hVI_fsrm.py#L459.

The class includes a backup operation https://github.com/pettni/pdf-abstraction/blob/b232d56ee88793c1233568f9a2196240bfdc54a1/best/hVI_fsrm.py#L614. For all important nodes in the product graph, it computes the local backups via https://github.com/pettni/pdf-abstraction/blob/b232d56ee88793c1233568f9a2196240bfdc54a1/best/hVI_fsrm.py#L626.
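To illustrate why a networkx-based product graph lends itself to structured backups, here is a minimal sketch (not the repository API; `local_backup`, the `value` node attribute, and the `reward` edge attribute are hypothetical names): each node stores a value, and a backup at a node maximizes over its successors.

```python
import networkx as nx

def local_backup(g, node, gamma=0.95):
    """Hypothetical point-based backup: the value of `node` becomes the
    best discounted successor value (reward stored on edges)."""
    succ = list(g.successors(node))
    if not succ:
        return g.nodes[node]['value']
    return max(g.edges[node, s].get('reward', 0.0)
               + gamma * g.nodes[s]['value'] for s in succ)

g = nx.DiGraph()
g.add_nodes_from([(0, {'value': 0.0}), (1, {'value': 0.0}), (2, {'value': 1.0})])
g.add_edge(0, 1, reward=0.0)
g.add_edge(1, 2, reward=0.0)

# sweep the nodes in reverse order so successors are updated first
for n in [1, 0]:
    g.nodes[n]['value'] = local_backup(g, n)
```

Sweeping in an order informed by the graph structure (here simply reverse order) is what makes a single pass propagate value from the goal node backwards.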

Remark that the function plot_results https://github.com/pettni/pdf-abstraction/blob/b232d56ee88793c1233568f9a2196240bfdc54a1/best/hVI_fsrm.py#L801 has not been finished yet. It has been pre-coded in the Jupyter file https://github.com/pettni/pdf-abstraction/blob/ScalableBest_class_change/Demo_file/Rocksample.ipynb.

2. Allow for inputs as formal specifications and as DFAs

Suppose that we would like to specify '!obs U (sample1 or sample2)', which is something we cannot do right now with the formula-based engine. We can then specify this immediately as a DFA. That is,

props = ['obs', 'sample1', 'sample2']
props = dict(zip(props, map(lambda x: 2 ** x, range(0, len(props)))))
fsa = Fsa()
# add the nodes
fsa.g.add_node('0')
fsa.g.add_node('trap')
fsa.g.add_node('1')
# add the transitions
fsa.g.add_edge('0', '1', weight=0, input={props['sample1'], props['sample2'], props['sample2'] + props['sample1']})
fsa.g.add_edge('0', '0', weight=0, input={0})
fsa.g.add_edge('0', 'trap', weight=0, input={props['obs']})

fsa.props = props  # how propositions are translated to numbers
fsa.final = {'1'}  # the final state
fsa.init = dict({'0': 1})  # the initial state

This is shipped as a dictionary:

formula_fsa = dict()
formula_fsa['fsa'] = fsa
formula_fsa['init'] = dict({'0':1})
formula_fsa['final'] = {'1'}
formula_fsa['prop'] = props

We then use this to create the DFA x FIRM product:

prod_ = spec_Spaths(firm, formula_fsa, env, n=125)

Here, firm is a roadmap object, formula_fsa is the dict with the DFA information, env is the environment, and n is the number of belief-point samples to use.
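The bitmask encoding used by the `props` dictionary above deserves a brief illustration: each proposition gets a power-of-two value, so any set of true propositions becomes a single integer, and an edge's `input` set lists exactly the integers that enable it. A small self-contained sketch (`label_to_int` is a hypothetical helper, not part of the repository):

```python
# each proposition maps to a distinct bit: obs -> 1, sample1 -> 2, sample2 -> 4
props = ['obs', 'sample1', 'sample2']
props = dict(zip(props, map(lambda x: 2 ** x, range(len(props)))))

def label_to_int(active, props):
    """Encode a set of true propositions as a bitmask integer."""
    return sum(props[p] for p in active)

# the edge '0' -> '1' fires when sample1 or sample2 (or both) hold:
enabling = {props['sample1'], props['sample2'],
            props['sample1'] + props['sample2']}

assert label_to_int({'sample1'}, props) in enabling
assert label_to_int({'sample1', 'sample2'}, props) in enabling
assert label_to_int({'obs'}, props) not in enabling
```

This is why the edge from '0' to '1' in the snippet above lists three integers: one for each labeling in which at least one sample proposition holds (and obs does not).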

3. The backups are ordered

The backups are now ordered based on self.sequence of the product graph.
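As a sketch of the idea behind such a sequence (hypothetical code, mirroring but not reproducing `self.sequence`): nodes close to the accepting state are backed up first, which can be obtained from a reverse shortest-path computation over the product graph.

```python
import networkx as nx

# toy product graph: s0 -> s1 -> goal, with a trap branch
g = nx.DiGraph([('s0', 's1'), ('s1', 'goal'), ('s0', 'trap')])
final = 'goal'

# distance (in edges) from each node to the accepting node;
# nodes that cannot reach it (e.g. the trap) get no backup order
dist = nx.shortest_path_length(g.reverse(copy=False), source=final)
sequence = sorted(dist, key=dist.get)  # 'goal' first, then its predecessors
```

Backing up along such a sequence lets value information propagate from the accepting states outward in a single sweep instead of requiring many unordered iterations.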

4. Convergence

The backups can now keep a dictionary with the old optimal values. Its use for quantifying convergence is described in the issue on convergence.
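Keeping the previous optimal values per node enables a standard stopping test, sketched below under the assumption that values are stored in a node-keyed dictionary (`converged` is a hypothetical helper, not the repository's function):

```python
def converged(old_vals, new_vals, tol=1e-4):
    """Stop when no product-graph node's value changed by more than tol
    between consecutive backup sweeps; nodes without an old value
    (first sweep) count as not converged."""
    return all(abs(new_vals[n] - old_vals.get(n, float('inf'))) < tol
               for n in new_vals)

old = {'a': 0.50, 'b': 0.90}
new = {'a': 0.50005, 'b': 0.90002}

assert converged(old, new)                        # changes below tol
assert not converged(old, {'a': 0.6, 'b': 0.9})   # 'a' moved too much
```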

5. 1-D value functions at a given node

For multi-dimensional belief spaces, the routine in https://github.com/pettni/pdf-abstraction/blob/ScalableBest_class_change/Demo_file/Rocksample.ipynb allows plotting the value function projected onto one dimension. The code is as follows:

import numpy as np
import matplotlib.pyplot as plt

def plot_bel(n, i):
    """Plot the value function at product node n = (q, v), projected onto
    belief dimension i (all other dimensions kept at b_reg_init)."""
    (q, v) = n
    vals = []
    for b_i in np.linspace(0, 1, 20):
        # replace the i-th entry of the initial regional belief by b_i
        b_ind = [prob if not el == i else [b_i]
                 for el, prob in enumerate(prod_.env.b_reg_init.tolist())]
        b = prod_.env.get_product_belief(b_ind)
        # value = max over alpha vectors of alpha^T b
        vals += [max(prod_.val[(q, v)].alpha_mat.T * b).tolist()]
    plt.figure(0)
    plt.plot(np.linspace(0, 1, 20), sum(vals, []))
    plt.ylabel('probability')
    ax = plt.gca()
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    plt.show()

This will be included in the actual code ASAP.