End-to-end-provenance / RDataTracker

An R library to collect provenance from R scripts.
http://end-to-end-provenance.github.io/
GNU General Public License v3.0

Data to procedure edges not being produced when called from a function #444

Status: Closed (jwons closed 5 years ago)

jwons commented 6 years ago

When ddg.run() is called from provDebugR, the data-to-procedure edges are not saved. When ddg.run() is called directly in the console, they are.

blernermhc commented 5 years ago

Here is an example script reproducing the problem:

a <- 3
b <- 2
cc <- a + b

With debug.init("SuperSimple.R"), there is no edge from a to the cc <- a + b statement, but with rdtLite::prov.run("SuperSimple.R") there is. The edge from b to that statement is present in both cases.

With debugging output on, we see this from debug.init:

[1] ".ddg.parse.commands: Processing cc <- a + b"
[1] ".ddg.parse.commands: Evaluating  cc <- a + b"
[1] ".ddg.parse.commands: Done evaluating  cc <- a + b"
[1] ".ddg.parse.commands: Adding operation node for cc <- a + b"
[1] "Adding procedure node 5 named cc <- a + b"
[1] "proc.node: Operation cc <- a + b"
[1] "Adding control flow edge 6 for p4 to p5"
[1] "proc2proc:  b <- 2   cc <- a + b"
[1] "CF p4 p5"
[1] ".ddg.parse.commands: Adding cc <- a + b information to vars.set"
[1] "Adding data flow in edge 7 for d2 to p5"
[1] "data2proc: b cc <- a + b"
[1] "DF d2 p5"
[1] ".ddg.parse.commands: Adding input data nodes for cc <- a + b"
[1] "Adding data node 3 named cc with scope R_GlobalEnv  and value  5"
[1] "data.node: Data cc"
[1] "Adding data flow out edge 8 for p5 to d3"
[1] "proc2data: cc <- a + b cc"
[1] "DF p5 d3"
[1] ".ddg.parse.commands: Adding output data nodes for cc <- a + b"

From rdtLite, we see:

[1] ".ddg.parse.commands: Processing cc <- a + b"
[1] ".ddg.parse.commands: Evaluating  cc <- a + b"
[1] ".ddg.parse.commands: Done evaluating  cc <- a + b"
[1] ".ddg.parse.commands: Adding operation node for cc <- a + b"
[1] "Adding procedure node 5 named cc <- a + b"
[1] "proc.node: Operation cc <- a + b"
[1] "Adding control flow edge 6 for p4 to p5"
[1] "proc2proc:  b <- 2   cc <- a + b"
[1] "CF p4 p5"
[1] ".ddg.parse.commands: Adding cc <- a + b information to vars.set"
[1] "Adding data flow in edge 7 for d1 to p5"
[1] "data2proc: a cc <- a + b"
[1] "DF d1 p5"
[1] "Adding data flow in edge 8 for d2 to p5"
[1] "data2proc: b cc <- a + b"
[1] "DF d2 p5"
[1] ".ddg.parse.commands: Adding input data nodes for cc <- a + b"
[1] "Adding data node 3 named cc with scope R_GlobalEnv  and value  5"
[1] "data.node: Data cc"
[1] "Adding data flow out edge 9 for p5 to d3"
[1] "proc2data: cc <- a + b cc"
[1] "DF p5 d3"
[1] ".ddg.parse.commands: Adding output data nodes for cc <- a + b"

blernermhc commented 5 years ago

The search list looks the same in both cases.

The result of search() when looking for a in debug.init is:

[1] ".ddg.create.data.use.edges: var = a"
 [1] ".GlobalEnv"        "package:ggplot2"   "tools:rstudio"     "package:stats"    
 [5] "package:graphics"  "package:grDevices" "package:utils"     "package:datasets" 
 [9] "package:methods"   "Autoloads"         "package:base"     

The result of search() when looking for a in prov.run is:

[1] ".ddg.create.data.use.edges: var = a"
 [1] ".GlobalEnv"        "package:ggplot2"   "tools:rstudio"     "package:stats"    
 [5] "package:graphics"  "package:grDevices" "package:utils"     "package:datasets" 
 [9] "package:methods"   "Autoloads"         "package:base"     
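
An identical search list does not by itself show where a name is bound: search() reports the attached packages and environments, not the frame in which a script's assignments land. As a minimal base-R sketch (the helpers find_defining_env and run_in_function are hypothetical and not part of rdtLite or provDebugR, and this is not claimed to be the cause of the bug), the following shows the same script binding a in different environments depending on where it is evaluated, while search() stays the same:

```r
# Hypothetical helper (not from rdtLite/provDebugR): walk up the chain of
# enclosing environments from `start` and return the first one binding `name`.
find_defining_env <- function(name, start = parent.frame()) {
  env <- start
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) {
      return(env)
    }
    env <- parent.env(env)
  }
  NULL  # name is not bound anywhere on the chain
}

# The example script from this issue, parsed into an expression vector.
script <- parse(text = "a <- 3\nb <- 2\ncc <- a + b")

# Case 1: evaluate in a dedicated environment standing in for top level.
top_env <- new.env()
eval(script, envir = top_env)
top_result <- find_defining_env("a", start = top_env)

# Case 2: evaluate from inside a function, so the bindings land in the
# function's own frame rather than in the caller's environment.
run_in_function <- function(exprs) {
  local_env <- environment()
  eval(exprs, envir = local_env)
  find_defining_env("a", start = local_env)
}
fn_result <- run_in_function(script)

# search() is the same whether it is called at top level or in a function.
same_search <- identical(search(), (function() search())())

print(identical(top_result, top_env))  # a is bound where the script ran
print(identical(fn_result, top_env))   # a different environment in case 2
print(same_search)
```

Comparing the environment returned by such a lookup against globalenv() in both the debug.init and prov.run runs would show whether the two tools are evaluating the script in different places.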

blernermhc commented 5 years ago

In debug.init, a is defined in the same environment as many other things, like:

[1] ".ddg.create.data.use.edges: var = a"
   [1] "%--%"                             "%->%"                            
   [3] "%<-%"                             "%>%"                             
   [5] "%c%"                              "%du%"                            
   [7] "%m%"                              "%s%"                             
   [9] "%u%"                              "a"                               
  [11] "absolutePanel"                    "actionButton"                    
  [13] "actionLink"                       "add_edges"                       
  [15] "add_layout_"                      "add_shape"