IntersectMBO / plutus

The Plutus language implementation and tools
Apache License 2.0
1.56k stars 477 forks source link

Provide expected result of compress and uncompress builtins for UPLC conformance #5814

Open michele-nuzzi opened 6 months ago

michele-nuzzi commented 6 months ago

Describe the feature you'd like

To facilitate independent implementations of builtins that can not leverage the usage of the same blst dependency used in this repository, it would be useful to provide the expected results of the builtins.

this is not currently the case, especially for builtins expecting bls12_381_G1_element and bls12_381_G2_element arguments like

given that, as per section C.3.4 of the plutus-core-specification pdf; these are always represented in the "uncompressed" form;

This makes it really hard to prove the result of the builtin is the one intended

Describe alternatives you've considered

No response

kwxm commented 6 months ago

Is the problem that you can't see the full result of the uncompress functions, including the y-coordinates? In Plutus Core that only ever exists in the in-memory representation, which we can't see (and in fact it might be considered an implementation detail since there are are a number of different ways of representing elliptic curve points: I think blst uses multiple internal representations internally). We do have some conformance tests which might be of some help, but they're far from exhaustive.

If you're trying to implement some of this stuff independently then the best I can suggest is that you write a small standalone program using blst that does full deserialisation (ie, not the compressed version that we use) and compare the results with your own.

kwxm commented 6 months ago

as per section C.3.4 of the plutus-core-specification pdf; these are always represented in the "uncompressed" form

That may be a typo, but we always use the compressed form. Plutus script sizes are quite limited (16384 bytes at most ) because they have to be stored on the chain. We did some experiments and it was clear that the time/space tradeoffs were much better for the compressed form.

michele-nuzzi commented 6 months ago

yes, it would turn useful to have the various representations ( I understand there are at least 3: affine, cartesian and uncompressed )

exactly because different representations are used it would turn useful to be explicit on what the result should be

That may be a typo, but we always use the compressed form.

Yes, I was referring to C.3.4 just for the textual representation, which always shows uncompressed instead.

effectfully commented 6 months ago

This issue says that it was triaged, but there's no indication of how it was triaged. Is it considered low priority? Are we waiting for more info or is this something that we want to follow up on soon? Just commenting on an issue is not sufficient to triage it.

I'm therefore returning the status: needs triage label.

effectfully commented 5 months ago

@michele-nuzzi

it would be useful to provide the expected results of the builtins.

Provide through what means? We have the conformance tests (were linked above by Kenneth). They are not exhaustive, but should be sufficient for ensuring that the correct representation is used by an alternative implementation.

Do you want us to include that in the spec?

@kwxm

(and in fact it might be considered an implementation detail since there are are a number of different ways of representing elliptic curve points: I think blst uses multiple internal representations internally)

But it's not convoluted enough for us to make using blst a necessity, right? I.e. it's not on the, say, GMP's level of fanciness?

It sounds to me like the compress builtins need to be in spec, because their result, being a ByteString, is serializable and pretty-printable and so there has to be an agreement on what that result is supposed to be. I'm not sure about uncompress, is it the case that any uncompress works as long as it produces a valid Element regardless of the representation? If so, we don't need to have it in the spec, but if BLS12_381.G2 and other builtins require a specific representation, then that has to be in the spec.

Does that make sense?

I'm going to say that this is a low priority issue for now.