IntersectMBO / plutus

The Plutus language implementation and tools
Apache License 2.0
1.57k stars 480 forks source link

On-chain function to convert from Integer to ByteString, or to parse ByteString into Integer #3657

Open longngn opened 3 years ago

longngn commented 3 years ago

Area

Describe the feature you'd like

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Additional context / screenshots

Add any other context or screenshots about the feature request here.

catch-21 commented 3 years ago

Hi, thanks @longngn. This is actually already being worked on to allow for Marlowe scalability improvements. I'll link a PR once there's one ready.

longngn commented 3 years ago

Hi, thanks @longngn. This is actually already being worked on to allow for Marlowe scalability improvements. I'll link a PR once there's one ready.

Thank James for working on this feature! Is it just serialize/deserialize Integer or any arbitrary IsData instance?

L-as commented 2 years ago

This might also fix #4168 .

Is there any update on this?

longngn commented 2 years ago

Actually I've written an Integer -> BS function for our internal contracts. Happy to write the reverse direction (BS -> Integer) and property-based testing and to create a PR

-- Convert from an integer to its text representation. Example: 123 => "123"
{-# INLINEABLE integerToBS #-}
integerToBS :: Integer -> BuiltinByteString
integerToBS x
  -- 45 is ASCII code for '-'
  | x < 0 = consByteString 45 $ integerToBS (negate x)
  -- x is single-digit
  | x `quotient` 10 == 0 = digitToBS x
  | otherwise = integerToBS (x `quotient` 10) <> digitToBS (x `remainder` 10)
  where
    digitToBS :: Integer -> BuiltinByteString
    -- 48 is ASCII code for '0'
    digitToBS d = consByteString (d + 48) emptyByteString
L-as commented 2 years ago

Thanks! Ah, you meant ASCII? I thought this issue was for getting a binary representation of it.

Benjmhart commented 2 years ago

+1 for binary representation as we're trying to enable efficient cryptographic behaviors.

michaelpj commented 2 years ago

Thanks! Ah, you meant ASCII? I thought this issue was for getting a binary representation of it.

As you point out here, there is no unique Integer -> ByteString mapping. You have to pick an encoding. So any builtin that we picked would be for a particular encoding. So we'd need to pick a particular one.

L-as commented 2 years ago

Isn't there only one obvious one? I.e. a base 256 representation.

michaelpj commented 2 years ago

No, there are many:

L-as commented 2 years ago

Little-endian is probably best because that's probably what most of us are used to dealing with.

longngn commented 2 years ago

Perhaps we can use ZigZag encoding to encode variable-length signed integer like in Plutus Core Base-256 is only possible for unsigned integer

effectfully commented 1 year ago

Describe the feature you'd like

On-chain function to convert from Integer (or IsData instance) to ByteString

We now have

  1. serialiseData :: BuiltinData -> BuiltinByteString
  2. a Show Builtins.Integer instance that you can use together with encodeUtf8

I don't know if we have any parsing capabilities.

@zliu41 do you happen to have any input on this issue?

zliu41 commented 1 year ago

Requests for adding new builtin functions should be discussed in https://github.com/cardano-foundation/CIPs.

For Plutus Tx library functions, my main concern is that converting integers to/from bytestrings without using new builtins, regardless of encoding, can be quite expensive (perhaps except serialiseData . mkI). I don't want to add a library function and make users think that they can just use it casually in their scripts (we do have the Show class, but it is intended for debugging). If we make it clear it is expensive, maybe it is OK.

effectfully commented 1 year ago

To iterate on what @michaelpj and @zliu41 said, the choices that we have are:

  1. add new builtins (tailored ones, or just deserialiseData). This should be done via https://github.com/cardano-foundation/CIPs
  2. implement encoding/decoding as regular functions by making arbitrary choices regarding the encoding. Those functions aren't going to be efficient and we're worried about having to handle complaints, optimize the functions and generally spend a lot of time on it. In particular, just looking at @longngn implementation (thank you @longngn for providing it!) from the above I can spot a number of performance issues right away: lack of tail recursion, quadratic behavior due to <> and consByteString being linear (which we probably can't avoid), x `quotient` 10 computed twice, x < 0 unnecessarily computed at each step, maybe something else -- not to mention that this direct encoding is nicely visual, but not compact at all

So given the lack of consensus and what appears to be a low priority problem, I'm going to label the issue with "Low priority".

effectfully commented 1 year ago

or just deserialiseData

Incidentally, here is some context regarding why we don't have this builtin. Credit for digging that out goes to @kwxm.