Should we use smart constructor for types with invariants in the `plutus-ledger-api`?

thealmarty commented 2 years ago

The purpose of this issue is to further the discussion in issue #4476 about adding smart constructors to some types exposed in plutus-ledger-api to validate invariants. The following has been suggested:

Naming these constructors unsafe.
Only expose smart constructors that are validated.

Currently, we make the deliberate choice of keeping the types simple. See some of the rationale here. Some of the reasons include:

Some types have invariants that we can't possible check all of, so we can't consistently do that for all types with invariants.
There's a danger of giving the developer an impression that all invarants are all checked - ultimately, only the ledger can do that.
It's costly to do, we need to work it out, write it out, test it thoroughly, and maintain it. Should we use that time to improve Plutus in other ways instead?

Of course, we could take a hybrid approach and validate some types with simple invariants. However, this could be confusing.

We welcome your input.

Benjmhart commented 2 years ago

Frankly, this is a disasterous API decision with serious security consequences.

if an invariant can be checked, then there should be a mechanism supplied by the plutus library to do so, otherwise we are inviting bugs and attack vectors into smart contracts. Do you really expect every plutus project to implement their own checks?

I'd like to know more about the invariants which can't be checked. can you please specify @thealmarty ?

thealmarty commented 2 years ago

Thanks for your input Ben. Adding some functions to check invariants that can be checked is one of the things we are considering.

To answer your question, as linked above, from the discussion in #2941:

The non-local conditions on the transaction as a whole would be quite difficult to replicate without duplicating the entirety of the ledger rules. Examples:

A type/smart constructors for valid CurrencySymbols: easy enough

A smart constructor for TxInfo that checks that the transaction balances etc. etc.: difficult without replicating the ledger rules entirely

peter-mlabs commented 2 years ago

Most of my opinions have been voiced here, but I'll summarize:

Better types (such as NonEmpty for non-empty lists) should be the first focus.
Newtypes with smart constructors and documented invariants should be the second.
Generators and validation functions are important for testing
All of the above is essential for developer experience, the on-boarding story, security, and code-as-documentation
As @zygomeb has point out elsewhere, undocumented invariants become de facto standards when official implementations make use them. This has already happened, and is costly both in financial and temporal terms.

With respect to the points raised here:

Some types have invariants that we can't possible check all of, so we can't consistently do that for all types with invariants.

There is a large swatch of invariants in cardano-ledger/#2491 that could be checked. There's also a difference between "stateful" invariants that would need some mock ledger to check against (such as the invariant the UTxO's used as inputs actually exist on the ledger) and "stateless" invariants (such as the txInfoFee :: Value needing to be positive).

There's also a difference between stateless invariants that are difficult to represent due to poor typing (such as as the txInfoFee being positive, or unnormalized Values) and stateless invariants that are tedious (but not difficult) to check (such as Datum <-> DatumHash coherence, or checking the a value referenced in a ScriptPurpose actually appears in the corresponding TxInfo). The former is an obvious candidate for better types, and the latter is an obvious candidate for validation functions and/or smart constructors.

It's costly to do, we need to work it out, write it out, test it thoroughly, and maintain it. Should we use that time to improve Plutus in other ways instead?

This is needed by teams in my organization and we've already both built parts of this functionality and work-arounds ourselves and submitted PRs to this repository to include similar functionality upstream. If IOHK will not take responsibility for this, it will simply be pushing the burden downstream where the community cannot benefit from centralized work. The fact that three separate teams in my organization have independently taken steps to remedy this is an indication to me that this is an important enough issue to be considered an improvement to plutus.

There's a danger of giving the developer an impression that all invariant are all checked - ultimately, only the ledger can do that.

Of course, we could take a hybrid approach and validate some types with simple invariants. However, this could be confusing.

At this point, why use types at all?

I don't think this should be a concern. Its akin to saying its useless to wear a helmet on your bike because you know it won't save you if you get struck by a bus; the point is to prevent a concussion if you hit a pothole and take a spill.

I'm failing to see, for instance, how using a Natural (or a smart-constructed wrapped Integer) for the txOutRefIdx could possibly be more confusing than using an Integer.

peter-mlabs commented 2 years ago

To answer your question, as linked above, from the discussion in https://github.com/input-output-hk/cardano-ledger/issues/2941:
The non-local conditions on the transaction as a whole would be quite difficult to replicate without duplicating the entirety of the ledger rules. Examples:
   A type/smart constructors for valid CurrencySymbols: easy enough
  A smart constructor for TxInfo that checks that the transaction balances etc. etc.: difficult without replicating the ledger rules entirely

As discussed there, there is a a huge difference between "this TxInfo can never been seen under any conditions whatsoever because it's complete nonsense" and "this TxInfo can be seen under certain configurations, states, or contexts that may or may not be realistic or possible to achieve with current tooling".

L-as commented 2 years ago

I just want to comment, that in Plutarch, we check all invariants that can be checked. For example, we define PPubKeyHash as something that is exactly 28 bytes, and you can try converting a bytestring into that partially.

L-as commented 2 years ago

@thealmarty I'll try to explain shortly why the current decision is problematic: Imagine someone making a standard state machine (maybe with the unusable library), where you can pay into it, then get your funds back 100 days later + a bit extra which the state machine earns from staking using stake validators. It might keep track of the amounts paid into it using a merkle tree such that the transactions are of size O(log n) in the amount contained. What happens if I, when deposting 100 ADA into this state machine, which might contain 1000000 ADA, specify an incorrect return PubKeyHash/Address in the redeemer? 100 days later once the state machine has decided it's time to pay "me" back, it will fail to progress, because it is impossible to construct the transaction in question, since the hash is invalid! From the perspective of the developer, this kind of detail is certainly not impossible to notice, but it makes sense to me that we should strive to make it harder to shoot yourself in the foot, not easier, even if it means less efficiency on average. That's part of the ethos of Cardano from my perspective.

thealmarty commented 2 years ago

Before we de-rail further, I want to make it clear that we heard you. We know you want improvements here. We would like to collect all the inputs and think about the best way to solve the issues. Basically this is going to form the basis of a proposal. And it looks like the proposal will turn into an epic in JIRA consisting of many tickets. As far as I see a lot of these suggestions are reasonable, but of course, I alone cannot promise to act on everything.

@peter-mlabs you have written a lot of suggestions/action points here, I could take them and tidy up those. Or, if you prefer, you can send me a document(link to a git gist?), adding whatever else you thought of, perhaps including links of work your team may have already done. Other parties can perhaps also send clearly identified action points along with their priorities to speed up the process?

If anyone can think of any other ways for a more productive communication let me know. I think using written documents is better than discussing it in a meeting at this stage.

Benjmhart commented 2 years ago

@thealmarty - when can we expect solid commitment and such a proposal?

Benjmhart commented 2 years ago

frankly i would like to see the low hanging fruit resolved, there's an existing PR to add smart constructors and documentation.

i saw you mention that the docs would be cherry-picked, this is a start and should happen sooner rather than later to prevent developers from missing these problems when looking at haddock.

However the PR should see review and merge. if the testing is insufficient, that sounds like a great PR comment.

thealmarty commented 2 years ago

frankly i would like to see the low hanging fruit resolved, there's an existing PR to add smart constructors and documentation.

i saw you mention that the docs would be cherry-picked, this is a start and should happen sooner rather than later to prevent developers from missing these problems when looking at haddock.

However the PR should see review and merge. if the testing is insufficient, that sounds like a great PR comment.

Yes I'm working on the PR - I already cherry-picked the commits from #4597.

thealmarty commented 2 years ago

@thealmarty - when can we expect solid commitment and such a proposal?

We'd prefer to give the community a bit of time to add their voices. In the meantime more documentation will be added, as mentioned. When you make your list, tell us the priorities, that would be useful to us.

Benjmhart commented 2 years ago

we made our high-priority list in March - here: https://github.com/input-output-hk/plutus/issues/4476#issuecomment-1084411640

michaelpj commented 2 years ago

From the perspective of the developer, this kind of detail is certainly not impossible to notice, but it makes sense to me that we should strive to make it harder to shoot yourself in the foot, not easier, even if it means less efficiency on average.

Let me reiterate again on what the point of plutus-ledger-api is. It is the interface for the ledger. The main point of it is to correspond simply to what is going on in a transaction.

It is pretty clear to me that this is in tension with being maximally safe for dapp developers to use. It's always a tradeoff, and a big question is where to put the tradeoff. We could rewrite it all in Idris... but that probably wouldn't be a good tradeoff. And the correct tradeoff will differ for different users who have different safety tolerances and desires.

It sounds like MLabs have a) very strong desires for type-based safety, and b) want to do lots of generator-based testing that needs more invariants. So I think it would regardless be a good idea for someone (MLabs?) to write a wrapper library that opts for maximum safety. Especially if it also provided generators, I'm sure many people would enjoy using that. That's the joy of open-source :) But please don't assert that your own desires are the only legitimate ones.

Nonetheless I think we can still add marginal improvements in plutus-ledger-api on a case-by-case basis. My main requirement is that we do not replicate validation logic from the ledger. We also shouldn't change the underlying representation.

There are a lot of assumptions about what people will interpret various types to mean. Quoting @L-as from the other issue

If the newtypes don't mean anything, why not just make them type synonyms? That way there will be no confusion wrt. what they mean.

The assumption seems to be "a newtype indicates some kind of semantic difference, a type alias does not". I asked some of my colleagues, and this view seems far from universal. People generally seemed perfectly happy with the idea of a newtype that's just a lightweight tag to help you avoid mixing things up. Perhaps you are atypical here? :)

L-as commented 2 years ago

A type synonym will make it clear that the unsafeFromBuiltinData definitions are equivalent.

blamario commented 2 years ago

@michaelpj :

Let me reiterate again on what the point of plutus-ledger-api is. It is the interface for the ledger. The main point of it is to correspond simply to what is going on in a transaction.

If you're saying that types in plutus-ledger-api should correspond to those in plutus-ledger, what would be lost if the latter were made more precise as well?

peter-mlabs commented 2 years ago

I'm getting the impression that there's a divergence in expectations here, namely that this has become an IOG vs MLabs (and in some ways, a plutarch vs plutus-tx) issue. I agree that this isn't productive, but I think its important to correct this and understand exactly is intended by this issue.

There are a few things here that have left me feeling frustrated:

We would like to collect all the inputs and think about the best way to solve the issues.
followed by a thumbs down on Ben's question of when we can expect a commitment on this
But please don't assert that your own desires are the only legitimate ones.

Because the issue in question would be a large undertaking, and regardless of personal feelings on these issues, these are reasonable comments -- you would be doing a disservice to your own teams by chasing down every last MLab's beck-and-call without first doing the due diligence to solicit additional feedback and verifying that these sentiments are more widely shared.

But to this end: @michaelpj, @thealmarty, are either of you aware of the conversations take took place between IOG and the CDA, with Andrew Sutherland (@asutherlandus, I believe) and Simon (who's surname and GH handle I don't know) last week and last month regarding these issues?

During the first meeting when Simon was soliciting feedback on the developer experience, I had brought up an internal effort that Liqwid Labs (the client I PM for) had been undertaking to "de-folklore" certain parts of Haskell, Plutus, and Cardano. The reputation that IOG has gained among a number of my colleagues is that it is not worthwhile pushing documentation upstream, filing issues, or submitting PRs, because they will either be ignored, bike-shedded, or rejected. When I related this to Simon, he replied that IOG was committed to improving this process and had hired technical writers to specifically remedy the folklore issue, and additional technical staff which would allow IOG to pay more attention to these issues, docs, and PRs.

It was only after this that I submitted the cardano-ledger/#2941 issue, because it seemed like a perennial point of confusion that was once again being tackled. The conversation with Jared seemed very productive there.

Discussing this with the staffer who originally filed #4476 was met with skepticism, but everyone at MLabs was happy to see a more fleshed out discussion happening.

This was followed by another technical forum last Thursday, where Andrew commended me for having such a detailed discussion with Jared rather than a drive-by "this sucks" issue. We discussed phase 1 validation (including the serialization "validation"/invariants discussed on the cardano-ledger issue) for about half of the meeting, including both previously mentioned issues and PR #4597.

I quote (I have a recording of this full meeting, if you're interested):

"This is why I'm here: I'm trying as best I can to plug this gap and to get as much feedback going into the team -- that these are the real use cases, these are the problems people are having, we need to do something to address problems."

This forum was attended by 3 IOG staffers (Addie Girouard, Ardiana Saez, and Andrew Sutherland), 3 MLabs staffers ( Ben Hart, myself, and Maks Brodowicz), and 6 other representatives of CDA member (Cody Butz, David Hoglinger, Dominika Kopec, Emily Martins, Ishan Gupta, Jonas Lindgren).

It should be apparent by now where my frustrations stem from:

I was warned that pushing issues with IOG would quickly turn into quicksand
I was urged to try again, and had a productive conversation with Jared
I discussed this internally with MLabs staff, direct hires at Liqwid Labs, and additional consultants from Platonic Systems whom I manage
I discussed this at subsequent CDA meetings with (off the top of my head) Cody Butz and Jonas Lindgren, neither of whom are MLabs staff
I discussed this synchronously with IOG staff in public at a technical forum set up explicitly for this purpose
Colleagues at MLabs have fixed a number of these issues downstream with Plutarch
My teams has not once, not twice, but now three times written this functionality "in the dark" based on poorly-documented and poorly communicated assumptions

And now I am discussing this yet again, and we are simultaneously getting:

pushback that the issue may be and "MLabs problem" and that it needs further input from the community
being asked to generate action items
explicit dismissal that IOG is willing to take responsibility for using types that maintain invariants
concession that some of these types may be worthwhile to improve
refusals of firm commitment that this is going to be prioritized and merged

So, with no disrespect intended -- because I am genuinely operating under the impression that either myself or @thealmarty , @michaelpj are operating under a different set of expectations -- I'd like answers to the following questions before I or my staff spend any more time on this issue.

Were either @thealmarty or @michaelpj aware of the monthly IOG technical forum meetings, and if so, how are the contents of these communicated? We meet weekly to discuss amongst ourselves, and bring these issues to the IOG on a monthly basis -- if this is insufficient as a "voice of the community" for any reason (perhaps the CDA is not perceived as legitimate or diverse enough, or perhaps we should adopt a Robert's Rules-style motion-and-approval of resolutions), what would be a better venue?
If I, as a project manager, cannot raise these issues on behalf of my (multi-organizational) team without being told that this is "an MLabs issue", how exactly would you like these issues communicated? I can ask my staff to :+1: or comment, but I don't want it to appear as if we're brigading.

It seems like there is still a need for a mutually trusted communication channel to differentiate between "Peter or MLabs is filing this issue personally" and "Peter is filing this issue after spending hours discussing it elsewhere".

asutherlandus commented 2 years ago

There seems to be a lot of shouting on this thread and its making me feel sad. I can see that the user experience for developers trying to write very optimized Plutus scripts that are also correct is very challenging. The Plutus teams have over 300 issues in our backlog and this is just 1 of them. In many ways I am pleased that we are able to engage with you to have this discussion, I wish the tone was a little more friendly. I just interpret the tone as a very legitimate expression of your frustration.

From one perspective I look at plutus-ledger-api and think that its an internal interface between the ledger and the Plutus evaluator and wonder why we are talking about it. From the Perspective of the Plutus script it may represent an implicit specification of the ledger interface to UPLC. I think it would be helpful to really dig in to this and work out what the highest value problem to solve is. We need to rank the issue and put it in order with the 300 other items. Is the lack of documentation of the ledger interface to Plutus scripts the real problem here?

The community and IOG got burned previously by having different implementations of the ledger rules in the emulator. We risk the same issue with smart constructors. We somehow need to ensure they are consistent with the actual ledger implementation winch implies a test suite and some maintenance overhead. Its easy to argue the doctrine of smart constructors, I think we are saying that the doctrine of one source of truth overrides here.

Given that you have already implemented workarounds maybe this isn't the highest priority thing to be discussing. From my perspective having the interface between ledger and Plutus script be a byte string that the Plutus script must deserialize is the really big issue I wish we were talking about.

I hear your concern about how to best communicate with the development teams inside IOG. I share this frustration and that was why I was asking on the last CDA call about the experience with IOG's developer and training folks. I thought the CDA was supposed to be part of the solution, but it seems to be struggling to get traction.

I can't be the primary point of contact for every issue that needs to resolved. What I need to understand is what are the top 3 most important issues that will have the biggest benefit for developer experience.

L-as commented 2 years ago

I am honestly satisfied with #4814.

As far as I can tell, plutus-ledger-api is not supposed to be an internal API in any way, since it is in many official IOG tutorials on Plutus, not to mention that there isn't any alternative to its abstraction.

The core issue from my perspective was simply that it was very easy to be fooled into making insecure code. #4814 is OK IMO, since it makes it clear that you can't rely on unsafeFromBuiltinData to validate untrusted inputs.

I don't know what @peter-mlabs, @Benjmhart and co. feel, but I think we can close this issue if we're satisfied with this solution.

I'm still not convinced that using type synonyms isn't a better solution, but :shrug:. The big issue has been fixed IMO.

Thank you @thealmarty

peter-mlabs commented 2 years ago

I agree with Las here. Documentation is the biggest fix we need, and we can work around the rest. Thanks to all involved.

thealmarty commented 2 years ago

With #4814 merged, I'm closing this. We'll be adding more Haddock and tidying up the API in more follow up PRs.

uhbif19 commented 1 year ago

And the correct tradeoff will differ for different users who have different safety tolerances and desires.

I thought that plutus-ledger-api is only used for on-chain scripts implementation. Which are other kinds of users?

So I think it would regardless be a good idea for someone (MLabs?) to write a wrapper library that opts for maximum safety.

So far I could not found where are API breaking changes documented and what is policy about them overall. I have looked here, but multiple changes I am aware do not seem to be covered: https://github.com/input-output-hk/plutus/blob/master/plutus-ledger-api/CHANGELOG.md

I do not think that it is possible to maintain external wrapper library without such things documented.

effectfully commented 1 year ago

So far I could not found where are API breaking changes documented and what is policy about them overall. I have looked here, but multiple changes I am aware do not seem to be covered: https://github.com/input-output-hk/plutus/blob/master/plutus-ledger-api/CHANGELOG.md

We only started keeping track of changes at the beginning of 2023. Are the multiple changes that you're aware of recent enough? If so, we should do something about it indeed. Could you please list them?

uhbif19 commented 1 year ago

@effectfully

No, I all I know seems to be happened before 1.1.*. Because hydra only bumped version recently and plutus-script-utils seems to be incomparable to 1.2, I presumed that this was some recent changes.

If all API changes are covered CHANGELOG.md now, when I was wrong and now it is possible to maintain external wrapper.

IntersectMBO / plutus

Should we use smart constructor for types with invariants in the `plutus-ledger-api`? #4811