ml4ai / delphi

Framework for assembling causal probabilistic models from text and software.
http://ml4ai.github.io/delphi
Apache License 2.0
24 stars 17 forks source link

Remaining For2Py Errors for SIR #340

Closed pauldhein closed 5 years ago

pauldhein commented 5 years ago

@pratikbhd, @hlim1, and @skdebray unfortunately there seems to still be bugs in the For2Py code that was merged into the SIR-Demo branch yesterday. I went ahead and updated our GrFN testing file delphi/tests/aske/test_GrFN.py to include tests for the SIR fortran models that we need for the demo. Running this file via the command pytest tests/aske/test_GrFN.py should allow you to see the bugs I have found with the various files.

@pratikbhd and @hlim1, please pull the latest from the SIR-Demo branch and run the test command above. After doing that please fix all bugs that appear as part of the For2Py pipeline and notify me on this issue once you are done. There still may be bugs in my portion of the pipeline that I need to work on after you have completed this task.

hlim1 commented 5 years ago

@pauldhein I just pulled and ran it. I will work with Pratik to get this done as a top priority to get it done ASAP. We'll let you know immediately once they are fixed.

pauldhein commented 5 years ago

Update

@pratikbhd and @hlim1 I see that you have made some progress on fixing the bugs in the PA pipeline. I too have been fixing bugs with the testing code in order to allow the next round of bugs to float to the surface.

Decision Wiring Bug

Below is a screenshot of the PETPT GrFN generated using our new pipeline:

Screen Shot 2019-08-30 at 10 14 21 AM

As you can see the decision statements are not wiring properly. I believe this is my fault for adding confusion to how we denote decision statements. We agreed to use the naming schema of var_name__0 and var_name__1 for the false and true decisions respectively. However, we should not use that naming schema for the GrFN wiring in the JSON, that should only be used for the lambdas. This becomes apparent in the PETPT model. As seen in the figure, eo_2 and eo_3 are not properly connected to eo_4. Fixing this bug will fix the execution for the PETPT and PETASCE models. Please let me know when this change has been made!

Unhandled Fortran Primitive Function

I want to move away from using pyrand() and instead use Fortran's own rand() function for the demo. We won't actually need to make calls to this function, but For2Py will need to be able to see this function and convert it to the python equivalent. Doing this translation involves adding the two following elements to the lambdas file:

  1. Add the following import line to the head of the lambdas file:
    from random import random
  2. Replace the Fortran call to rand() with random()

That's all the bugs I have for now, but please report back here as you fix bugs and I will add more bugs here as I find them.

hlim1 commented 5 years ago

@pauldhein Since I considered fixing "Unhandled Fortran Primitive Function" might be a quicker fix than the other, I've handled the bug first. Now, whenever rand() is seen in the Fortran source code, it will be translated to random() Python function at the level of Python IR generation. Then, the genPGM will simply add from random import random line at the top of the lambda file. Thus, for example, Fortran code if (rand() < (rateInfect/totalRates)) then will generate a lambda function like below.

def SIR_Gillespie_MS__model__condition__IF_0__0(rateinfect: Real, totalrates: Real):
    return (random() < (rateinfect/totalrates))

I will let you know as we fix bugs. And, yes, please let us (@pratikbhd and I) know as find more.

Thanks!

pauldhein commented 5 years ago

Awesome, thanks @hlim1!

pratikbhd commented 5 years ago

@pauldhein , the Decision Wiring Bug has been fixed. Please pull the latest code and check if it is fixed on your side as well.

pratikbhd commented 5 years ago

Bug

The following bug arises on networks.py when running crop_yield.f.

  File "/Users/pratikbhandari/delphi/GrFN/networks.py", line 176, in <listcomp>
    if d == 0 and self.nodes[n]["type"] == "variable"
KeyError: 'type'

I tried printing out self.nodes to see its contents. Looks like it is a NodeView object? (Not sure).. @pauldhein, can you look into this and let me know what the contents of self.nodes should be and why types was not found?

adarshp commented 5 years ago

I tried adding the following lines above that line:

        for n, d in self.nodes(data=True):
            if len(d) == 0:
                print("Node without data: ", n, d)

Which showed that a couple of nodes in the crop_yield GrFN don't have data:

Node without data:  crop_yield::crop_yield::loop$0::0::total_rain::-1 {}
Node without data:  crop_yield::crop_yield::loop$0::0::yield_est::-1 {}
pratikbhd commented 5 years ago

Are the two nodes above supposed to appear in the GrFN or lambda files? If so, they are not currently produced in both of the files. Also, the names of the nodes seem weird in terms of the GrFN spec (the ::0:: in particular).

pauldhein commented 5 years ago

@adarshp thanks for the sleuthing! It seems the bug is coming from the handling of the updated list in crop_yield. After saving the GrFN JSON for crop_yield I found the following being generated for the crop_yield::crop_yield::loop$0 container:

{
  "name": "@container::crop_yield::crop_yield::loop$0",
  "source_refs": [],
  "gensym": "c_f1d67f9fc82c",
  "repeat": true,
  "arguments": [
    "@variable::yield_est::-1",
    "@variable::total_rain::-1",
    "@variable::consistency::-1",
    "@variable::absorption::-1",
    "@variable::max_rain::-1",
    "@variable::rain::-1"
  ],
  "updated": [
    "@variable::rain::0",
    "@variable::total_rain::-1",
    "@variable::yield_est::-1"
  ],
}

The two variables in the updated list that have -1 as their index represent an error (no update the index is recorded). @pratikbhd please fix this to ensure that the updated variables list always tracks the index of the variables at the end of the container.

pauldhein commented 5 years ago

@pratikbhd and @hlim1 another bug ...

As noted above, the expected list of arguments for crop_yield::crop_yield::loop$0 in crop_yield is:

"arguments": [
  "@variable::yield_est::-1",
  "@variable::total_rain::-1",
  "@variable::consistency::-1",
  "@variable::absorption::-1",
  "@variable::max_rain::-1",
  "@variable::rain::-1"
],

However, the call statement made to this container has the following input list:

"input": [
  "@variable::total_rain::1",
  "@variable::yield_est::1",
  "@variable::rain::0",
  "@variable::max_rain::1",
  "@variable::consistency::1",
  "@variable::absorption::1"
],

These two lists do not match, and they need to, as this enforces the calling convention between the two containers. The same should be true for the updated and return_value lists. Please fix this bug as well.

pauldhein commented 5 years ago

Also, the names of the nodes seem weird in terms of the GrFN spec (the ::0:: in particular).

@pratikbhd this is a new index that I add to the container name portion of the variable name to aid in GrFN formation during the translation (and another good example of why GrFN JSON is not a full specification of a GrFN but rather a compressed representation).

In a generic source code model/program we will likely see containers that are called multiple times. In GrFN JSON we record the specification for a container once, and then may have multiple call statements for the container. During wiring, we need to create unique nodes in the graph for every instance of a container call (that's why variable names include the name of the namespace and scope). But if we have the same container called multiple times then when creating the graph we need another index to track which call the specific set of inputs/outputs are going to wire to/from. Furthermore the nodes inside the container (both variable and function nodes) need an index that keeps them unique from the nodes created for another call to the same container. Thus I introduced this call_index to solve the multi-call problem. Sorry for the added confusion.

pratikbhd commented 5 years ago

@adarshp thanks for the sleuthing! It seems the bug is coming from the handling of the updated list in crop_yield. After saving the GrFN JSON for crop_yield I found the following being generated for the crop_yield::crop_yield::loop$0 container:

{
  "name": "@container::crop_yield::crop_yield::loop$0",
  "source_refs": [],
  "gensym": "c_f1d67f9fc82c",
  "repeat": true,
  "arguments": [
    "@variable::yield_est::-1",
    "@variable::total_rain::-1",
    "@variable::consistency::-1",
    "@variable::absorption::-1",
    "@variable::max_rain::-1",
    "@variable::rain::-1"
  ],
  "updated": [
    "@variable::rain::0",
    "@variable::total_rain::-1",
    "@variable::yield_est::-1"
  ],
}

The two variables in the updated list that have -1 as their index represent an error (no update the index is recorded). @pratikbhd please fix this to ensure that the updated variables list always tracks the index of the variables at the end of the container.

This has been fixed. Please pull the latest code and verify if the issue is solved and let me know if it still persists.

pratikbhd commented 5 years ago

@pratikbhd and @hlim1 another bug ...

As noted above, the expected list of arguments for crop_yield::crop_yield::loop$0 in crop_yield is:

"arguments": [
  "@variable::yield_est::-1",
  "@variable::total_rain::-1",
  "@variable::consistency::-1",
  "@variable::absorption::-1",
  "@variable::max_rain::-1",
  "@variable::rain::-1"
],

However, the call statement made to this container has the following input list:

"input": [
  "@variable::total_rain::1",
  "@variable::yield_est::1",
  "@variable::rain::0",
  "@variable::max_rain::1",
  "@variable::consistency::1",
  "@variable::absorption::1"
],

These two lists do not match, and they need to, as this enforces the calling convention between the two containers. The same should be true for the updated and return_value lists. Please fix this bug as well.

@pauldhein This has been fixed. Please pull the latest code and verify if the issue is solved and let me know if it still persists.

pratikbhd commented 5 years ago

Bug

The following bug arises when I run pytest tests/aske/test_GrFN.py:

  File "/Users/pratikbhandari/delphi/GrFN/networks.py", line 331, in process_call_statement
    (parent, var_name, idx) = inputs[inp]
KeyError: '@variable::t::-1'

The error above results for the SIR-Gillespie-SD_inline.f file.

I tried debugging around a little bit by looking at how the functions were passed around to see if it's a straightforward thing to fix but couldn't figure it out and didn't want to make incorrect assumptions. @pauldhein, any idea what is happening here?

pauldhein commented 5 years ago

@pratikbhd the bug above is caused by a miscommunication in the definition of input arguments (or variables that have an index of -1). This error is caused when processing the call to @container::SIR-Gillespie-SD_inline::gillespie::loop$1 during wiring. I have reproduced the JSON emitted by the PA module for this call below:

Screen Shot 2019-08-31 at 2 35 19 PM

All of the variables highlighted below in the input list are marked with an index of -1, declaring them to be input arguments to the gillespie container, but these are not input arguments, and in fact they have not been defined yet, so they should not be marked as such. These variables will eventually be defined in @container::SIR-Gillespie-SD_inline::gillespie::loop$1. Allow them to be defined there and remove the entries for these variable names from the associated input and argument lists.


Now to the bigger question: how do we decide when a variable has been defined?

My thoughts on this are that a variable is officially defined the first time we see it on the left-hand side of an equals sign. So even though in Fortran all of the variables appear at the top of the gillespie subroutine the first time we see an instance of t is inside loop$1 and therefore it is officially defined in loop$1 as opposed to being an input to loop$1. Does that make sense?

cl4yton commented 5 years ago

My 2 cents: I think the "location" and "time" that a variable is defined is when it is first declared, which may be before it is first assigned. My rationale is that that declaration determines the variable scope. E.g., we could have a variable that is declared at the top-level but then not first assigned until within a function; if we only considered the "first definition" to be with the assignment, then don't we risk loosing the variable's true scope information?

pauldhein commented 5 years ago

Hmmm. I do see @cl4yton's point, but in that case, we need explicit declaration statements for all of the variables defined in a scope at their definition (likely at the beginning of the container).

Here's why: when I see @variable::t::-1 in the input list of a call to loop$1 (a statement that appears in the body of the gillespie container), that tells me that @variable::t::-1 is actually not defined in gillespie and I need to go one scope further out in order to find the variable definition. What we need is for @variable::t::0 to be defined at the beginning of the gillespie container if we want it to have the scope of the full container (which it should).

cl4yton commented 5 years ago

Got it, that seems right, @pauldhein . I'm not sure I know what it means for a variable to "be defined at he beginning of [a] container", as the "occurrence" within wiring is based on assignment. My main point is to just ensure that the scope of the variable gets captured based on the variable declaration.

So at this point, I don't have further opinion, so would like @skdebray @pratikbhd @hlim1 to speak to how this should be from the PA perspective.

pratikbhd commented 5 years ago

@pauldhein I see the issue that is causing networks.py to fail and I agree that it needs to be fixed. Variables like @variable::t::-1 are not an input to the loop$1 container and should not be in the inputs and arguments field.

The reason they are there right now is that I was using a naive heuristic of looking at all variables that are on the right side of an assignment operation and binning them as inputs. This will not always be true since sometimes a variable can appear on the right side and still have its value defined within the container (such as variable t in loop$1). There will be further complexities to handle when dealing with such situations. For example in the code below:

t = 4
for i in range(0, x):
    if y<2:
        t = 2.5
    z = t + 2

Here, the value of t in z = t+2 will either be 2.5 or 4 depending on the value of the IF condition. In such a case, how do we statically decide whether t is an input to the loop container or not? It will be an input to the loop container if y<2 is False. Otherwise, it will not be an input to the loop container. This is something we can think about and decide in the weeks to come but for now (handling SIR-Gillespie), we don't have to face with such situations and can proceed with not having variables like t as input to the loop container loop$1.


As for deciding when a variable is defined, I agree with @cl4yton's notion of preserving the scope of where a variable is declared. In the intermediate Python source code of the original Fortran code, we have a variable declaration of the following form:

t: List[float] = [None]

Whenever such statements are encountered, the variable is given an index 0 and will be declared as @variable::t::0 as suggested by @pauldhein. The first assignment of this variable will then result in its index being incremented to 1 only if that is done within the same container. However, in SIR-Gillespie-SD_inline.f, t is declared in gillespie and defined for the first time in loop$1 (another container). This results in t being defined as @variable::t::0 within the loop$1 container like this:

@variable::SIR-Gillespie-SD_inline::gillespie.loop$1::t::0

Does this indexing scheme work? Please let me know if this doesn't make sense and I need to further elaborate.

For now, I'll work on removing the variables which are not inputs to the loop$1 container and fix its indexing as well.

cl4yton commented 5 years ago

Hi @pratikbhd, to address your first question, for the example:

t = 4
for i in range(0, x):
    if y<2:
        t = 2.5
    z = t + 2

You said:

Here, the value of t in z = t+2 will either be 2.5 or 4 depending on the value of the IF condition. In such a case, how do we statically decide whether t is an input to the loop container or not? It will be an input to the loop container if y<2 is False. Otherwise, it will not be an input to the loop container.

Response: Here, t is definitely an input to the loop container, it just only gets referenced by one of the conditional branches and not the other.

For the second question, re. the earlier declaration of t that doesn't get assigned until within a loop container. I believe we should not start indexing variables until they are assigned values; prior to that, there's no value to keep track of, and I think that indices should play the role of tracking values. So what we do need is to capture the scope of the initial declaration, but then start indexing once assignments are made, starting with 0 for the first. Does that make sense @pratikbhd and @pauldhein ?

pauldhein commented 5 years ago

I concur with @cl4yton.

pratikbhd commented 5 years ago

That does make sense @cl4yton, thank you. I will make the required changes to reflect this indexing.

cl4yton commented 5 years ago

Excellent, thank you @pratikbhd and @pauldhein .

To echo what @skdebray noted before: these choice NEED to be documented. They aren't currently captured in the GrFN openapi part of the specification. They need to go into the GrFN specification document -- which I intend to be the "verbose" part of what originally was the GrFN_spec but is now being split into (a) the openapi specification and (b) what should be the complementary GrFN documentation. I am going to try to get a skeleton of the latter up into the repo today, but I am going to ask you both to please fill in some of these details (in places I hope to point you to). Me being the bottleneck on this documentation is a real problem, so hopefully this will be a near-term solution until I can get to it properly.

pratikbhd commented 5 years ago

Error

@pauldhein, after fixing the indexing issue discussed above, I ran test_GrFN.py on pytest and got the following error:

==================================== ERRORS ====================================
_____________ ERROR at setup of test_petpt_creation_and_execution ______________

    @pytest.fixture
    def petpt_grfn():
>       return GroundedFunctionNetwork.from_fortran_file("tests/data/program_analysis/PETPT.for")

tests/aske/test_GrFN.py:22: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
delphi/GrFN/networks.py:466: in from_fortran_file
    fortran_file, mode_mapper_dict)
delphi/GrFN/networks.py:447: in from_python_src
    return cls.from_dict(pgm_dict, lambdas)
delphi/GrFN/networks.py:412: in from_dict
    return cls(G, scope_tree, returns+updates)
delphi/GrFN/networks.py:178: in __init__
    self.function_sets = self.build_function_sets()
delphi/GrFN/networks.py:65: in build_function_sets
    find_distances(initial_funcs, 0)
delphi/GrFN/networks.py:63: in find_distances
    find_distances(list(set(all_successors)), dist + 1)
delphi/GrFN/networks.py:63: in find_distances
    find_distances(list(set(all_successors)), dist + 1)
delphi/GrFN/networks.py:63: in find_distances
    find_distances(list(set(all_successors)), dist + 1)
delphi/GrFN/networks.py:63: in find_distances
    find_distances(list(set(all_successors)), dist + 1)
delphi/GrFN/networks.py:63: in find_distances
    find_distances(list(set(all_successors)), dist + 1)
delphi/GrFN/networks.py:63: in find_distances
...
...
delphi/GrFN/networks.py:63: in find_distances
    find_distances(list(set(all_successors)), dist + 1)
delphi/GrFN/networks.py:63: in find_distances
    find_distances(list(set(all_successors)), dist + 1)
delphi/GrFN/networks.py:61: in find_distances
    all_successors.extend(self.call_graph.successors(func))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <networkx.classes.digraph.DiGraph object at 0x7ff8425bfeb8>
n = 'PETPT::@global::petpt::0::PETPT__petpt__assign__eo__0'

    def successors(self, n):
        """Return an iterator over successor nodes of n.

        A successor of n is a node m such that there exists a directed
        edge from n to m.

        Parameters
        ----------
        n : node
           A node in the graph

        Raises
        -------
        NetworkXError
           If n is not in the graph.

        See Also
        --------
        predecessors

        Notes
        -----
        neighbors() and successors() are the same.
        """
        try:
>           return iter(self._succ[n])
E           RecursionError: maximum recursion depth exceeded while calling a Python object

../../../../../.python_venv/venv_delphi/lib/python3.7/site-packages/networkx/classes/digraph.py:794: RecursionError

Any idea why this is happening?

pauldhein commented 5 years ago

@pratikbhd I am pretty sure that the problem has something to do with this indexing being generated in the PETPT JSON:

{
  "function": {
    "name": "PETPT__petpt__assign__albedo__0",
    "type": "lambda"
  },
  "input": [
    "@variable::msalb::-1"
  ],
  "output": [
    "@variable::albedo::0"
  ],
  "updated": []
},
{
  "function": {
    "name": "PETPT__petpt__assign__albedo__0",
    "type": "lambda"
  },
  "input": [
    "@variable::msalb::-1",
    "@variable::xhlai::-1"
  ],
  "output": [
    "@variable::albedo::0"
  ],
  "updated": []
},
{
  "function": {
    "name": "PETPT__petpt__decision__albedo__0",
    "type": "lambda"
  },
  "input": [
    "@variable::albedo::0",
    "@variable::albedo::0",
    "@variable::IF_0::0"
  ],
  "output": [
    "@variable::albedo::0"
  ],
  "updated": []
},
pratikbhd commented 5 years ago

Just fixed the above issue.

adarshp commented 5 years ago

Closing this as the issue is resolved.