scallop-lang / scallop

Framework and Language for Neurosymbolic Programming. Join Our Discord: https://discord.gg/RavzdND229
https://www.scallop-lang.org
MIT License

Scallopy custom provenance requires implementing three functions not specified in provenance.py #5

Closed susuzheng closed 1 year ago

susuzheng commented 1 year ago

Hi, we are building a customized provenance. After implementing all the functions that provenance.py lists as required, we were stumped by error messages like

pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: PyErr { type: <class 'AttributeError'>, value: AttributeError("'CustomProbabilityProvenance' object has no attribute 'tagging_fn'"), traceback: None }

The three missing functions are tagging_fn, recover_fn, and discard. We had to implement them without knowing what they are for.

This error can also be reproduced by running the provided examples that define a custom provenance.

Liby99 commented 1 year ago

Thanks for your interest in building a customized provenance! We have a detailed explanation of the tagging_fn, recover_fn, and discard functions in our upcoming paper. Here I will provide a quick explanation:

For tagging_fn and recover_fn, the idea is to separate the tags accepted by Scallop into 3 categories: a) input tags (I), b) internal tags (T), and c) output tags (O). The tagging function has signature I -> T, and the recover function has signature T -> O. All the provenance operations (add, mult, zero, one, negate, aggregate) operate on internal tags T.
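To make the I -> T -> O split concrete, here is a minimal toy sketch. It is plain Python and does not import Scallopy; the class name PercentProvenance, the choice of percentages as input tags, and the two-argument signatures for add and mult are all assumptions made for illustration, not the library's actual API.

```python
class PercentProvenance:
    """Hypothetical toy provenance (not part of Scallopy).

    I = int percentage (0..100), T = float probability (0.0..1.0),
    O = str formatted to two decimals.
    """

    def tagging_fn(self, percent: int) -> float:
        # I -> T: convert a user-facing percentage into an internal probability
        return percent / 100.0

    def recover_fn(self, prob: float) -> str:
        # T -> O: convert the internal probability into a printable output tag
        return f"{prob:.2f}"

    def mult(self, t1: float, t2: float) -> float:
        # Conjunction works on internal tags only (assumed signature)
        return t1 * t2

    def add(self, t1: float, t2: float) -> float:
        # Disjunction works on internal tags only (assumed signature)
        return max(t1, t2)


prov = PercentProvenance()
a = prov.tagging_fn(50)   # internal tag 0.5
b = prov.tagging_fn(40)   # internal tag 0.4
joint = prov.mult(a, b)   # still an internal tag
print(prov.recover_fn(joint))
```

The point of the sketch is only that user-supplied tags are converted once on the way in, every operation stays in T, and the conversion to O happens once on the way out.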

If you are building your customized provenance, I suppose you have already designed your internal tag space T. In this case, you can assume that I = T = O and simply implement the tagging and recover functions as the identity function.
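The identity case can be sketched as follows. This is a standalone illustration rather than the library's code: the class name IdentityTags is hypothetical, and in a real custom provenance these two methods would sit on your own provenance class next to add, mult, and the other operations.

```python
class IdentityTags:
    """Hypothetical helper for the case I = T = O (not part of Scallopy)."""

    def tagging_fn(self, input_tag):
        # I -> T: the input tag is already a valid internal tag
        return input_tag

    def recover_fn(self, internal_tag):
        # T -> O: the internal tag is returned unchanged as the output tag
        return internal_tag


tags = IdentityTags()
print(tags.recover_fn(tags.tagging_fn(0.3)))
```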

The discard function is for early removal of facts during computation. For example, if you are storing probabilities as tags, a fact with probability 0 can be removed during computation, since it is impossible and will not contribute to the final result. In this case, you can implement discard as

def discard(self, tag):
  return tag == 0

That is, when the tag is 0, the function returns True, indicating that we can discard the fact. In case you don't want any tuple to be discarded, you can have the function always return False.
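The two discard policies described above can be compared side by side in a small sketch. The prune helper is a hypothetical stand-in for whatever the runtime does with the result of discard; it is not Scallop's actual implementation, and the class names and example facts are made up for illustration.

```python
class DropZeroProbability:
    def discard(self, tag):
        # True means the fact may be removed early
        return tag == 0


class KeepEverything:
    def discard(self, tag):
        # Never remove any fact
        return False


def prune(provenance, tagged_facts):
    # Hypothetical helper: keep only the facts whose tag is not discarded
    return [(tag, fact) for tag, fact in tagged_facts if not provenance.discard(tag)]


facts = [(0.9, ("edge", 0, 1)), (0.0, ("edge", 1, 2)), (0.4, ("edge", 2, 3))]
print(prune(DropZeroProbability(), facts))  # the 0.0-tagged fact is dropped
print(prune(KeepEverything(), facts))       # all three facts survive
```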

Again, thank you for bringing these up. While I have only given quick explanations and solutions here, I realize that we should provide default implementations for these functions (e.g. identity functions for tagging_fn and recover_fn, and a discard that always returns False). We will provide these in an upcoming version.

susuzheng commented 1 year ago

Thank you very much for the detailed and timely response! I understand and am unblocked now. It would be nice to have some comments in the source code for quick reference in future releases, as you have always been doing!