Thanks for your interest in building a customized provenance! We have a detailed explanation of the `tagging_fn`, `recover_fn`, and `discard` functions in our upcoming paper. Here I will provide a quick explanation:
For `tagging_fn` and `recover_fn`, the idea is that we want to separate the tags accepted by Scallop into 3 categories: a) input tags (`I`), b) internal tags (`T`), and c) output tags (`O`). The tagging function has signature `I -> T`, and the recover function has signature `T -> O`. All the provenance operations (`add`, `mult`, `zero`, `one`, `negate`, `aggregate`) operate on internal tags `T`.
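To make the three tag spaces concrete, here is a minimal standalone sketch. It does not import Scallop. The class name `MaxMinProb` and the choice of tag types are assumptions made purely for illustration (input tags are optional floats, internal tags are floats, output tags are percentage strings). Only the method names `tagging_fn`, `recover_fn`, `add`, `mult`, `zero`, and `one` come from the explanation above.

```python
from typing import Optional


class MaxMinProb:
    """Hypothetical provenance-like class for illustration; not part of Scallop."""

    def tagging_fn(self, input_tag: Optional[float]) -> float:
        # I -> T: a fact supplied without a probability is treated as certain.
        return 1.0 if input_tag is None else float(input_tag)

    def recover_fn(self, internal_tag: float) -> str:
        # T -> O: present the internal probability as a percentage string.
        return f"{internal_tag * 100:.0f}%"

    # The operations below only ever see internal tags (T).
    def zero(self) -> float:
        return 0.0

    def one(self) -> float:
        return 1.0

    def add(self, t1: float, t2: float) -> float:
        return max(t1, t2)

    def mult(self, t1: float, t2: float) -> float:
        return min(t1, t2)


p = MaxMinProb()
a = p.tagging_fn(0.9)   # input tag with a probability
b = p.tagging_fn(None)  # input tag omitted, so the fact is certain
c = p.tagging_fn(0.4)
derived = p.add(p.mult(a, b), c)  # (a and b) or c, computed on internal tags
result = p.recover_fn(derived)
print(result)
```

The point of the split is that the user-facing formats on either side can change without touching the operations in the middle, which only work on `T`.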
If you are building your own customized provenance, I suppose you have already designed your internal tag space `T`. In that case, you can assume that `I = T = O` and simply implement the tagging and recover functions as the identity function.
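A minimal sketch of the identity approach, assuming `I = T = O`. The classes `IdentityTagging` and `CountingProvenance` are hypothetical names for this example and are written as plain Python classes; in real code you would place the two methods on your own provenance class.

```python
class IdentityTagging:
    """Hypothetical mixin: input, internal, and output tags share one type."""

    def tagging_fn(self, input_tag):
        # I -> T, where I = T: pass the tag through unchanged.
        return input_tag

    def recover_fn(self, internal_tag):
        # T -> O, where T = O: pass the tag through unchanged.
        return internal_tag


class CountingProvenance(IdentityTagging):
    """Illustrative tag space: natural numbers counting derivations."""

    def zero(self):
        return 0

    def one(self):
        return 1

    def add(self, t1, t2):
        return t1 + t2

    def mult(self, t1, t2):
        return t1 * t2


prov = CountingProvenance()
tag = prov.tagging_fn(3)
combined = prov.add(prov.mult(tag, prov.one()), 2)
print(prov.recover_fn(combined))
```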
The `discard` function is for early removal of facts during the computation. For example, if you are storing probabilities as tags, a fact with probability 0 can be removed during the computation, since it cannot hold and will not contribute to the final result. In this case, you can implement `discard` as
    def discard(self, tag):
        return tag == 0
That is, when the tag is 0, the function returns `True`, indicating that we can discard the fact. If you don't want any tuple to be discarded, you can have the function always return `False`.
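The effect of the two `discard` variants can be sketched as follows. The helper `prune` and both class names are hypothetical and exist only to show the idea; this is not how Scallop applies `discard` internally, only an illustration of what returning `True` or `False` means for a tagged fact.

```python
class ProbDiscardZero:
    """Hypothetical provenance fragment: drop facts whose probability is 0."""

    def discard(self, tag):
        return tag == 0


class KeepEverything:
    """Hypothetical provenance fragment: never drop any fact."""

    def discard(self, tag):
        return False


def prune(provenance, tagged_facts):
    # Illustrative helper: keep only the facts the provenance does not discard.
    return [(tag, fact) for tag, fact in tagged_facts if not provenance.discard(tag)]


facts = [(0.8, ("edge", 0, 1)), (0.0, ("edge", 1, 2)), (0.3, ("edge", 2, 3))]

kept_nonzero = prune(ProbDiscardZero(), facts)
kept_all = prune(KeepEverything(), facts)
print(len(kept_nonzero), len(kept_all))
```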
Again, thank you for bringing these up. While I have just offered some quick explanations and solutions, I realize that we should provide default implementations of these functions (e.g., identity functions for `tagging_fn` and `recover_fn`, and a `discard` that always returns `False`). We will provide these features in an upcoming version.
Thank you very much for the detailed and timely response! I understand and am unblocked now. It would be nice to have some comments in the source code for quick reference in future releases, as you have always been doing!
Hi, we are building a customized provenance. While implementing all the functions required by provenance.py, we were stumped by error messages like
Three of them are `tagging_fn`, `recover_fn`, and `discard`. We had to implement them without readily knowing what they are for. This error can also be reproduced by running the provided examples that customize a provenance.