whole-tale / wt-prov-model

Experiments, design documents, and prototypes supporting a provenance model for Tales and runs.
MIT License

Views of extracted ReproZip facts #8

Open tmcphillips opened 4 years ago

tmcphillips commented 4 years ago

Query and visualization of the ReproZip trace, and integration with other sources of provenance information, can be facilitated by providing (and materializing) alternative views of the information.

For example, it will be helpful to create (Prolog) views of the ReproZip trace that clearly distinguish between files opened for reading and files opened for writing. This distinction is currently encoded in the Mode argument of each rpz_opened_file fact (in the example below, Mode is 1 for the file that is read and 2 for the file that is written). Similarly, directory accesses can be distinguished from (non-directory) file accesses using the IsDirectory argument.

So, the following from the current 05-cat-file-to-file example...

% FACT: rpz_opened_file(FileID, RunID, ProcessID, File, Mode, IsDirectory, Timestamp).
rpz_opened_file(f30, r0, p2, "/mnt/c/Users/tmcphill/OneDrive/GitRepos/wt-prov-model/examples/05-cat-file-to-file/outputs/output.txt", 2, false, nil).
rpz_opened_file(f35, r0, p2, "/mnt/c/Users/tmcphill/OneDrive/GitRepos/wt-prov-model/examples/05-cat-file-to-file/inputs/input.txt", 1, false, nil).

...could be represented (with relative paths) as:

% FACT: wt_process_wrote_file(ProcessID, RunID, FileID, File, Timestamp).
wt_process_wrote_file(p2, r0, f30, "./outputs/output.txt", nil).

% FACT: wt_process_read_file(ProcessID, RunID, FileID, File, Timestamp).
wt_process_read_file(p2, r0, f35, "./inputs/input.txt", nil).
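
As a rough sketch of how these views could be derived as rules over the extracted facts (rather than written by hand), something like the following might work in SWI-Prolog 7 or later, where double-quoted text is a string. Several things here are assumptions and not part of the current model: that Mode is a bit mask with bit 1 meaning read and bit 2 meaning write (consistent with the two facts above, but not verified against ReproZip), and the names wt_run_base_dir/2, wt_relative_path/3 and wt_materialize_views/0, which are hypothetical helpers introduced only for illustration.

```prolog
% Facts copied from the 05-cat-file-to-file example.
% FACT: rpz_opened_file(FileID, RunID, ProcessID, File, Mode, IsDirectory, Timestamp).
rpz_opened_file(f30, r0, p2, "/mnt/c/Users/tmcphill/OneDrive/GitRepos/wt-prov-model/examples/05-cat-file-to-file/outputs/output.txt", 2, false, nil).
rpz_opened_file(f35, r0, p2, "/mnt/c/Users/tmcphill/OneDrive/GitRepos/wt-prov-model/examples/05-cat-file-to-file/inputs/input.txt", 1, false, nil).

% HYPOTHETICAL fact: the directory (with trailing slash) that paths in a run
% are made relative to. Not part of the current extracted facts.
wt_run_base_dir(r0, "/mnt/c/Users/tmcphill/OneDrive/GitRepos/wt-prov-model/examples/05-cat-file-to-file/").

% HYPOTHETICAL helper: rewrite File as "./<suffix>" when it lies under the
% run's base directory; otherwise leave the path unchanged.
wt_relative_path(RunID, File, RelPath) :-
    (   wt_run_base_dir(RunID, Base),
        string_concat(Base, Suffix, File)
    ->  string_concat("./", Suffix, RelPath)
    ;   RelPath = File
    ).

% VIEW: wt_process_wrote_file(ProcessID, RunID, FileID, File, Timestamp).
% ASSUMPTION: bit 2 of Mode marks a write. Directories are excluded.
wt_process_wrote_file(ProcessID, RunID, FileID, RelPath, Timestamp) :-
    rpz_opened_file(FileID, RunID, ProcessID, File, Mode, false, Timestamp),
    Mode /\ 2 =\= 0,
    wt_relative_path(RunID, File, RelPath).

% VIEW: wt_process_read_file(ProcessID, RunID, FileID, File, Timestamp).
% ASSUMPTION: bit 1 of Mode marks a read. Directories are excluded.
wt_process_read_file(ProcessID, RunID, FileID, RelPath, Timestamp) :-
    rpz_opened_file(FileID, RunID, ProcessID, File, Mode, false, Timestamp),
    Mode /\ 1 =\= 0,
    wt_relative_path(RunID, File, RelPath).

% HYPOTHETICAL helper: materialize both views by printing each solution
% as a ground fact.
wt_materialize_views :-
    forall(wt_process_wrote_file(P, R, F, File, T),
           portray_clause(wt_process_wrote_file(P, R, F, File, T))),
    forall(wt_process_read_file(P, R, F, File, T),
           portray_clause(wt_process_read_file(P, R, F, File, T))).
```

Querying wt_process_wrote_file(P, R, F, File, T) against these two facts should bind File to "./outputs/output.txt" for f30, and wt_process_read_file/5 should give "./inputs/input.txt" for f35. Treating Mode as a bit mask means a file opened for both reading and writing would appear in both views, which may or may not be the desired behavior.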

Portions of the trace can also be hidden when materializing the views. For example, the traces can be greatly simplified by ignoring the process corresponding to the run.sh script, along with the files it accesses, leaving only the processes started by this script, and their children, in the trace.
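
One possible way to express this kind of hiding is sketched below, again assuming SWI-Prolog. Everything prefixed with ex_ is a hypothetical stand-in, since the process facts needed here are not shown in this issue: ex_process/3 records a toy process tree, ex_run_script_process/2 marks which process corresponds to run.sh, and the process IDs, file IDs and paths are made up for illustration. The view names wt_visible_process/2 and wt_visible_opened_file/7 are likewise only suggestions.

```prolog
% HYPOTHETICAL toy process tree: ex_process(ProcessID, RunID, ParentID).
ex_process(p1, r0, nil).   % stands in for the run.sh process
ex_process(p2, r0, p1).    % started by run.sh
ex_process(p3, r0, p2).    % child of p2

% HYPOTHETICAL marker for the process that corresponds to run.sh.
ex_run_script_process(r0, p1).

% Toy file accesses (illustrative paths, same schema as rpz_opened_file/7).
rpz_opened_file(f1, r0, p1, "./run.sh", 1, false, nil).
rpz_opened_file(f2, r0, p2, "./inputs/input.txt", 1, false, nil).
rpz_opened_file(f3, r0, p3, "./outputs/output.txt", 2, false, nil).

% ex_descendant(P, Ancestor): P is a child, grandchild, ... of Ancestor.
ex_descendant(P, Ancestor) :-
    ex_process(P, _, Ancestor).
ex_descendant(P, Ancestor) :-
    ex_process(P, _, Parent),
    ex_descendant(Parent, Ancestor).

% VIEW: processes kept in the simplified trace, i.e. everything started
% (directly or indirectly) by the run.sh process, but not that process itself.
wt_visible_process(RunID, ProcessID) :-
    ex_run_script_process(RunID, Script),
    ex_descendant(ProcessID, Script).

% VIEW: file accesses made by visible processes only, so the files opened
% by run.sh itself drop out of the trace.
wt_visible_opened_file(FileID, RunID, ProcessID, File, Mode, IsDir, Timestamp) :-
    rpz_opened_file(FileID, RunID, ProcessID, File, Mode, IsDir, Timestamp),
    wt_visible_process(RunID, ProcessID).
```

With these toy facts, wt_visible_process(r0, P) should yield p2 and p3 but not p1, and wt_visible_opened_file/7 should keep f2 and f3 while dropping f1. A file that is opened by both run.sh and one of its children would still appear through the child's access, which seems like the right outcome for a simplified trace.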