anza-xyz / solana-pay

A new standard for decentralized payments.
https://solanapay.com
Apache License 2.0

[rfc] Spec Proposal: Transaction Request (formerly Request Link) #26

Closed jordaaash closed 2 years ago

jordaaash commented 2 years ago

Update

This proposal has changed. A specification and implementation now exist in #77.

Abstract

This is a new proposal for an extension to the Solana Pay specification.

This proposal draws in part from https://en.bitcoin.it/wiki/BIP_0072, relying on HTTPS for transmitting and authenticating arbitrary transaction payloads.

Motivation

There are some significant shortcomings of the simple BIP21-based payment link scheme described in the Solana Pay specification.

1. It only describes simple native SOL and SPL token transfers.

Merchants, service providers, and apps may wish to mint NFTs or transfer reward tokens with purchases, invoke programs, pay gas for customer transactions, and enable many other use cases that may be developed with arbitrary transactions.

Transactions on Solana must specify the accounts that will be included in the transaction upfront. Most useful instructions require the wallet signer address, their auxiliary token account address, etc.

This requires knowing the wallet address and being able to generate PDAs from it, which cannot be known when the link is created. In short, it's missing a sessionless "connect wallet" function.

2. Payment requests are not authenticated.

We may expect that payment links will be maliciously or accidentally misused. Without knowing who a receiving address belongs to, it's not possible to determine from the URL who is requesting the payment.

A mechanism such as an HTTPS link allows the wallet to authenticate the source of the request. There may be other mechanisms we should consider.

Proposal

I propose to add an optional request=<url-encoded-url> query parameter to the specification.
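As a minimal sketch of how such a link could be assembled, assuming Python's standard library is used for the URL encoding: the helper name `build_request_link` and the placeholder recipient `ABC` are illustrative and not part of the proposal.

```python
from urllib.parse import quote


def build_request_link(recipient: str, request_url: str) -> str:
    """Build a `solana:` link carrying a URL-encoded `request` parameter.

    Hypothetical helper for illustration; the name is not from the spec.
    """
    # safe="" makes quote() percent-encode ':' and '/' too, so the nested
    # URL cannot be confused with the structure of the outer link.
    return f"solana:{recipient}?request={quote(request_url, safe='')}"


print(build_request_link("ABC", "https://merchant.com/solanapay"))
# prints: solana:ABC?request=https%3A%2F%2Fmerchant.com%2Fsolanapay
```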

An interactive protocol between the merchant and wallet follows:

  1. The merchant presents a QR code with the following payment link:
    solana:<recipient>?request=https%3A%2F%2Fmerchant.com%2Fsolanapay

    Any of the parameters of the spec can also be included. <recipient> could be considered optional. Perhaps an invalid address (e.g. a, x, or _) could be provided for compatibility.

    Regardless, it must not be used by the wallet if request is provided, so providing an invalid recipient address ensures a wallet will not prompt a user to make a payment unless it can handle the request parameter.

  2. The customer scans the QR code and opens their wallet app.

  3. The wallet parses the link and prompts the user to make a request to https://merchant.com/solanapay.

    This is analogous to connecting a wallet to a dapp, so obtaining the user's permission is important for privacy.

  4. If permitted, the wallet makes a request to

    https://merchant.com/solanapay?from=<wallet>&<...params>

    The wallet should include any parameters from the URL provided, except for the request parameter.

  5. The merchant responds with a JSON object:

    {"transaction":"<transaction>"}

    The transaction property value must be a base64-encoded serialized transaction.

    The feePayer, recentBlockhash, nonceInfo, and signatures fields are optional but may be included. If they are included, the wallet must use them in the final transaction, since the transaction may be partially signed and subsidized by the merchant.

    The wallet should allow additional fields in the JSON object, which may be added by future specifications.

  6. The wallet deserializes the transaction, simulates it, and presents it for signing.

    The wallet may wish to display the domain the request came from, and may wish to show payment requests not including a request parameter as unauthenticated.

  7. The user signs the transaction and the wallet sends and confirms it.

  8. The merchant discovers the transaction through the reference parameter, if provided.
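The wallet-side steps above (building the request URL and reading the JSON response) could be sketched roughly as follows. This is an illustration under assumptions, not the specified behavior: the function names are hypothetical, no network call is made, and deserializing the decoded bytes into a transaction is left to a Solana library that is not shown here.

```python
import base64
import json
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_wallet_request(link: str, wallet: str) -> str:
    """Return the URL a wallet would fetch for a `solana:` link with `request`."""
    params = parse_qsl(urlsplit(link).query, keep_blank_values=True)
    request_urls = [v for k, v in params if k == "request"]
    if not request_urls:
        raise ValueError("link has no request parameter")
    request_url = request_urls[0]  # already percent-decoded by parse_qsl
    # Forward every link parameter except `request`, plus the wallet address.
    forwarded = [("from", wallet)] + [(k, v) for k, v in params if k != "request"]
    # If the request URL already has a query string, extend it with '&'.
    separator = "&" if "?" in request_url else "?"
    return f"{request_url}{separator}{urlencode(forwarded)}"


def decode_response(body: str) -> bytes:
    """Extract the serialized transaction bytes from the merchant's JSON reply."""
    payload = json.loads(body)  # unknown extra fields are simply ignored
    return base64.b64decode(payload["transaction"], validate=True)
```

Ignoring unknown keys in `decode_response` mirrors the requirement that wallets allow additional fields in the JSON object.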

jordaaash commented 2 years ago

Breaking Changes

When using the request field, the transaction can be anything -- it doesn't even need to be a payment. It could be a transaction to receive a gift or invitation from the merchant for scanning a wallet. Even if it is a payment, the receiving address could be dynamic, or a smart contract, etc.

This means that semantically, there may be no recipient to provide. I think we should allow URLs with no recipient of the form

solana:?request=<url>

This is shorter (will encode in a lower-density QR code) and is more correct than providing a value that shouldn't be used and hoping it isn't.

BIP72 shows this as a valid, non-backward compatible example, but we don't really need to worry about backwards compatibility at this time.

This is a breaking change because the spec currently requires the recipient field value to be a public key, and I propose that we allow an empty string.
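A rough sketch of how a parser might accept both forms, assuming Python's `urllib.parse`: the function name `parse_link` and the returned dictionary shape are illustrative choices, and no validation of the recipient as a public key is attempted.

```python
from urllib.parse import parse_qs, urlsplit


def parse_link(link: str) -> dict:
    """Parse a `solana:` link whose recipient may be empty when `request` is set."""
    parts = urlsplit(link)
    if parts.scheme != "solana":
        raise ValueError("not a solana: link")
    request = parse_qs(parts.query).get("request", [None])[0]
    recipient = parts.path or None  # empty string means no recipient was given
    if request is None and recipient is None:
        raise ValueError("link needs a recipient or a request parameter")
    # Per the proposal, the recipient must not be used when `request` is present.
    return {"recipient": None if request else recipient, "request": request}
```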

jordaaash commented 2 years ago

Notes

Some cool use cases this unlocks:

  1. Merchants get an atomic bidirectional communication channel with customers. They can mint an NFT or transfer loyalty reward tokens in the transaction.
  2. Merchants could potentially see what tokens a user has, accepting and denominating payment in any of them.
  3. Merchants can pay for transactions on their user's behalf so they don't need SOL in a wallet.
  4. Merchants can return an error from the server to decline to respond with a transaction. This could be used to allow permissioned payments.
  5. Payments can be directed to escrow-like programs, enabling things like refunds, chargebacks, and other return mechanisms.
  6. DeFi transactions could be bridged to all kinds of web2 / IRL portals.
  7. Wallets can retrieve other information, or merchants can pass it to them, like an icon to display, or other fields in the JSON response.
  8. It doesn't even need to be a payment. Merchants could send tokens, invitations, gifts to customers that connect a wallet, perhaps one that meets some criteria, such as possessing an NFT.

Some things I'm unsure about:

  1. Do we need to do anything to ensure message integrity?
  2. If merchants are co-signing the transaction, how do we prevent spam / DoS?
  3. How can custodial exchange wallets adapt this functionality?
  4. How do we keep users safe when arbitrary transactions are supported?

joncinque commented 2 years ago

This all looks really good. Some questions regarding your questions:

  1. Do we need to do anything to ensure message integrity?

How is the merchant certain that the transaction is correct afterwards? If the user just gets a plain unsigned transaction, their wallet could manipulate all sorts of things inside of it. Will the merchant hold onto a hash of the message somewhere and check for that? Note that if the merchant signs the transaction, then this isn't a consideration since it'll get rejected by the chain.

On the subject of signatures, if the merchant signs the transaction, then the transaction must be sent within the blockhash's lifetime, roughly 2 minutes, which may be short for some use-cases, but certainly not a deal-breaker.

  2. If merchants are co-signing the transaction, how do we prevent spam / DoS?

This could be provided at the level of the merchant API. Before signing anything, they can do as many checks as they want on their server to be sure that they should sign this message. For example, they can check the recipients / senders, number of transactions currently outstanding, etc. They can even add protections to the API that generates the payment link, so that they avoid getting their wallet drained.
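One such server-side check, limiting the number of outstanding transactions per wallet, might look like the sketch below. The class name `RequestGuard`, the default limit of 3, and the 120-second window (chosen to roughly match the blockhash lifetime mentioned above) are all assumptions made for illustration.

```python
import time
from typing import Dict, List, Optional


class RequestGuard:
    """Caps how many recently issued transactions a wallet may have outstanding."""

    def __init__(self, max_outstanding: int = 3, ttl_seconds: float = 120.0):
        self.max_outstanding = max_outstanding
        self.ttl_seconds = ttl_seconds
        self._issued: Dict[str, List[float]] = {}

    def allow(self, wallet: str, now: Optional[float] = None) -> bool:
        """Return True and record the request if the wallet is under its cap."""
        now = time.monotonic() if now is None else now
        # Forget requests older than the window; they can no longer land.
        recent = [t for t in self._issued.get(wallet, []) if now - t < self.ttl_seconds]
        allowed = len(recent) < self.max_outstanding
        if allowed:
            recent.append(now)
        self._issued[wallet] = recent
        return allowed
```

The merchant API would call `allow(wallet)` before signing and return an error instead of a transaction when it yields False.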

  4. How do we keep users safe when arbitrary transactions are supported?

This is a bigger question not specific to this proposal. Anytime a transaction is created for you, there's risk. In a perfect world, the wallet would mediate all of these things to avoid exposing you to risk. For example, a dapp instruction should almost never require a signature from your main wallet, so the wallet can check if any instructions name your wallet as a signer. You should be able to do everything through ephemeral accounts and approves on SPL tokens. The wallet can also add balance checks to the transaction, as discussed elsewhere.

jordaaash commented 2 years ago

Thanks so much for the swift review @joncinque!

How is the merchant certain that the transaction is correct afterwards? [...] Will the merchant hold onto a hash of the message somewhere and check for that?

I think so, yeah. Right now, merchants validate transactions once they discover them to verify that they contain a valid transfer of the expected amount to their address. This is a little complicated even for the simple case. I think for arbitrary transactions, they may want to just store a message hash and check it more cheaply.
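A minimal sketch of that message-hash idea follows. It assumes the merchant already has the serialized transaction message bytes (extracting them requires a Solana library, not shown), that SHA-256 is an acceptable digest, and that an in-memory dictionary stands in for real storage; the names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical in-memory store: request id -> digest of the issued message.
expected_hashes: Dict[str, bytes] = {}


def remember_message(request_id: str, message_bytes: bytes) -> None:
    """Record the digest of the message the merchant handed to the wallet."""
    expected_hashes[request_id] = hashlib.sha256(message_bytes).digest()


def matches_expected(request_id: str, message_bytes: bytes) -> bool:
    """Check that a discovered transaction carries the same message bytes."""
    expected = expected_hashes.get(request_id)
    if expected is None:
        return False
    actual = hashlib.sha256(message_bytes).digest()
    return hmac.compare_digest(expected, actual)
```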

if the merchant signs the transaction, then the transaction must be sent within the blockhash's lifetime

I think this is totally fine for the use cases where this feature makes sense. Even a minute is a long time to wait to confirm the transaction in terms of UX, so I think the emphasis will need to be on making sure this happens quickly and reliably. Nonce accounts are something I don't know much about. Is there a way a merchant could retry a transaction with a nonce account?

Before signing anything, they can do as many checks as they want on their server to be sure that they should sign this message.

Yeah, good points. I think if the merchant has a trusted PoS that is getting permission to produce unique request URLs from a trusted service, say with a secret auth token, they can make sure they aren't getting spammed at scale, and are maybe just eating the occasional failed tx fee. This can ideally be left up to service providers to innovate on their behalf.

One thing is that the PoS app or ecommerce app will need to become aware of the transaction somehow. I can think of basically two ways --

  1. The merchant app encodes solana:ABC?request=https://merchant.com/solanapay?reference=XYZ and polls for signatures on pubkey XYZ as it does with the current protocol. The wallet makes a request to the request link URL, which returns a transaction with some instruction referencing XYZ.
  2. The merchant app has, say, a socket connection to the merchant API. The merchant app encodes solana:ABC?request=https://merchant.com/solanapay/UNQ where UNQ is just some opaque unique identifier for the request. The merchant API signs the tx as the fee payer, so it has the primary tx signature now. It transmits this signature to the merchant app, and now the merchant app can just reliably wait for confirmations on that tx without any reference.

My general expectation is that merchants will usually want to pay tx fees for users, and 2) produces a cleaner result / UX.
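The polling in option 1 could be sketched as below. The `fetch_signatures` callable is a hypothetical stand-in for whatever the merchant app uses to look up signatures that mention the reference key (for example, a wrapper around the `getSignaturesForAddress` RPC method); the attempt count and delay are arbitrary.

```python
import time
from typing import Callable, List, Optional


def wait_for_signature(
    fetch_signatures: Callable[[str], List[str]],
    reference: str,
    attempts: int = 30,
    delay_seconds: float = 1.0,
) -> Optional[str]:
    """Poll until a signature referencing `reference` appears, or give up."""
    for attempt in range(attempts):
        signatures = fetch_signatures(reference)
        if signatures:
            return signatures[0]  # first entry returned by the lookup
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return None
```

With option 2 no polling by reference is needed, since the merchant API already knows the primary signature and the app can wait on that directly.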

How do we keep users safe when arbitrary transactions are supported? This is a bigger question not specific to this proposal.

True. I just want to get feedback to make sure this protocol isn't adding novel attack vectors I haven't thought of.

jordaaash commented 2 years ago

  3. How can custodial exchange wallets adapt this functionality?

This is potentially a can of worms -- if you're using your FTX wallet that's doing withdrawals via the current version of Solana Pay, you don't have the same freedom to sign arbitrary txs that a Phantom user does.

I don't know, for example, if the merchant API will be able to see your balances. And if they want to do something like send you an NFT or novel SPL token, you're in trouble because there's a good chance the custodial exchange can't receive it.

arkitoure commented 2 years ago

This is great.

We are currently working through a lot of what you have laid out here via Magento <-> Solana Pay integration.

Purchased products mint NFTs with metadata attached, which for us specifically live in a metaverse space for cross-reality interoperability. If the mint is not 'within' the transaction, it would happen in parallel.

  5. Payments can be directed to escrow-like programs, enabling things like refunds, chargebacks, and other return mechanisms.

This is an important point due to individual laws regarding 'right to withdraw' in whichever country applies. In the EU it's 14 days, so we could in theory hold funds in escrow for this time and then push them to another account after the cooling-off period is over. But it would be nice to have something baked into the native transaction.

Perhaps we, myself and @molotovbliss, can add to this conversation in the coming days and see what you think.

jordaaash commented 2 years ago

@arkitoure I'd love to talk more about your Magento integration, I think this is something that could be instrumental to the ecosystem. What's a good way to contact you? Or DM me on Twitter if you'd like 🙂

joncinque commented 2 years ago

Is there a way a merchant could retry a transaction with a nonce account?

The better bet is probably just to retry with a fresh blockhash. Nonces are going to be more trouble than they're worth.

My general expectation is that merchants will usually want to pay tx fees for users, and 2) produces a cleaner result / UX.

That was my thinking too, and makes the implementation cleaner!

owenkellogg commented 2 years ago

This is super important! Request URLs are becoming a standard among payment systems like Bitpay and Anypay. They solve many problems and should be seriously adopted by Solana Pay.

One concept that might not have been addressed, but which offers immense power and flexibility, is payments with multiple recipients. With Payment Request URLs the wallet can download a structure specifying multiple amounts to different parties, unlocking huge potential for creative payment apps that aren't simply "send one amount from one person to another". Imagine a payment at a store where the clerk is paid, the owner is paid, the banker is paid and the consignment supplier is paid all in one swipe!
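As a rough illustration of how a merchant server might compute such a multi-recipient structure before building the transaction, the sketch below splits an integer amount (in the token's smallest unit) by integer weights. The function name, the weights, and the rule of giving the rounding remainder to the first recipient are all assumptions, not anything defined by the payment protocol.

```python
from typing import Dict


def split_amount(total: int, shares: Dict[str, int]) -> Dict[str, int]:
    """Split `total` among recipients in proportion to integer weights."""
    weight_sum = sum(shares.values())
    if weight_sum <= 0:
        raise ValueError("shares must have a positive total weight")
    # Integer division avoids floating-point drift in token amounts.
    amounts = {name: total * weight // weight_sum for name, weight in shares.items()}
    # Whatever rounding leaves over goes to the first listed recipient.
    first = next(iter(amounts))
    amounts[first] += total - sum(amounts.values())
    return amounts


print(split_amount(1000, {"owner": 70, "clerk": 10, "banker": 5, "supplier": 15}))
```

Each resulting amount would then become one transfer instruction in the single transaction returned to the wallet.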

Already many wallet apps have adopted the payment request URL protocol. I will follow up in a bit to make sure this powerful capability becomes a staple in Solana Pay.

jordaaash commented 2 years ago

Imagine a payment at a store where the clerk is paid, the owner is paid, the banker is paid and the consignment supplier is paid all in one swipe!

This is a really cool application of this!

Do you have any thoughts on how, in general, custodial wallets might implement this? Their use of different withdrawal and deposit addresses and sweeping funds into cold wallets makes something flexible like this harder for them to handle.

Separately, it could also be useful to allow wallets to send additional instructions to include in the transaction. For example, a wallet can protect users by appending an instruction to assert their balance didn't change by more than an expected amount, but this wouldn't be possible if the merchant's request URL API signed the tx without including that instruction.

Is this something we should consider in scope or leave to future specification?

owenkellogg commented 2 years ago

For reference and compatibility with other systems it should probably conform to the existing JSON v2 Payment Protocol documented here:

https://bitpay.com/docs/payment-protocol

Steven Zeiler

owenkellogg commented 2 years ago

The merchant might not even need to sign the transaction. However, it is extremely useful for speed for the wallet to transmit the signed transaction directly to the merchant's API so the merchant does not have to monitor the entire network for the transaction. Sent p2p in that fashion, we have been able to drastically reduce payment times in-store at Anypay (for instance here https://twitter.com/AnypayX/status/1362949033095008261). Direct transmission between wallet and merchant provider is key, and part of the existing JSON Payment Protocol spec.

jordaaash commented 2 years ago

For reference and compatibility with other systems it should probably conform to the existing JSON v2 Payment Protocol documented here: https://bitpay.com/docs/payment-protocol

Hmm, this is a lot of round trips. And also, the intent here is to return a transaction payload rather than just payment requests. Which aspects of the protocol are you thinking it should conform to?

However, it is extremely useful for speed for the wallet to transmit the signed transaction directly to the merchant's API so the merchant does not have to monitor the entire network for the transaction.

This is a good point and something we should test on Solana. Having the wallet broadcast the tx has some advantages for the UX though -- the user receives immediate feedback with confirmations, and the wallet controls how this is presented.

Let me read through this payment protocol more thoroughly!

jordaaash commented 2 years ago

Transaction Requests specified and implemented in #77

PaulFidika commented 2 years ago

The most obvious thing this spec is missing is a redirect_uri.

When the merchant responds with a JSON object, it should include an additional field:

    {"transaction":"<transaction>","redirect_uri":"<uri>"}

After the transaction is signed and submitted by the wallet, the wallet should automatically forward to the redirect_uri, or have a button that says 'continue' that forwards to the redirect_uri. The redirect_uri should also have a txsig= query parameter added, which the merchant could use to check the status of the transaction. In my opinion, though, the merchant should instead keep polling an RPC node looking for its transaction. Polling is a slower and more expensive method, but more reliable because this redirect cannot be guaranteed (the user could confirm a transaction and then exit out before the wallet redirects them).

If the user rejects the transaction, we can redirect the user back to the redirect_uri along with a txsig="fail" or some more specific error message, to let the merchant know their transaction request was declined or failed for some reason.

The advantage of this is that payment flows originating from within apps will be simpler. I can simply push 'pay with Solana' within any iOS native app, be directed to my wallet app, confirm the transaction, and then be redirected back to the success screen inside of the app. Or we can do the same thing with any web-app being viewed in a mobile browser.
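A small sketch of how a wallet might append the suggested txsig parameter to a redirect_uri, assuming Python's `urllib.parse`: the helper name `with_txsig` is hypothetical, and neither redirect_uri nor txsig is part of the specification referenced in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_txsig(redirect_uri: str, txsig: str) -> str:
    """Return `redirect_uri` with a `txsig` query parameter set."""
    parts = urlsplit(redirect_uri)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Replace any txsig the merchant supplied rather than duplicating it.
    query = [(key, value) for key, value in query if key != "txsig"]
    query.append(("txsig", txsig))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_txsig("https://merchant.com/done?order=42", "SIG"))
# prints: https://merchant.com/done?order=42&txsig=SIG
```

On rejection, the same helper could be called with "fail" (or a more specific error value) in place of the signature.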

jordaaash commented 2 years ago

Thanks @PaulFidika. This is a good point and it's something that I've been thinking about and talking with mobile wallets about. However, transaction requests aren't the only thing redirect URLs or deep links are applicable to, and there are many ways wallets might want to implement them, so right now it's been evaluated in parallel rather than blocking this feature.

PaulFidika commented 2 years ago

I was recently looking into Wallet Connect (I wasn't familiar with it until yesterday). Why doesn't Solana just use Wallet Connect, or at least build a standard similar to it? I feel like Solana Pay is somewhat re-inventing the wheel here. I'm just wondering if there's some sort of advantage to Solana Pay over wallet-connect, or something you dislike about Wallet Connect?

jordaaash commented 2 years ago

Solana Pay is based on BIP 21, and this feature is based on BIP 70/72. QR codes are just one way that Solana Pay URLs may be encoded. URLs can be used with NFC, messaging protocols, app to app communication, etc.

WalletConnect 2 is a more generic chain-agnostic protocol and requires the use of relay servers. No wallets on Solana supported it (none do now, except Steak Wallet) and the spec and implementation of WC2 are incomplete.

There are issues with complexity, latency, and reliability. We considered it but chose to work with wallets to implement something simpler after evaluating its limitations with them.

However, there is WalletConnect support being added to the wallet-adapter library now. This won't supersede Solana Pay for the use cases we are focusing on though.

PaulFidika commented 2 years ago

WalletConnect 2 is a more generic chain-agnostic protocol and requires the use of relay servers. No wallets on Solana supported it (none do now, except Steak Wallet) and the spec and implementation of WC2 are incomplete.

There are issues with complexity, latency, and reliability. We considered it but chose to work with wallets to implement something simpler after evaluating its limitations with them.

Okay interesting. Why does Wallet Connect use relay servers? Why not simply have the dapp <-> wallet communicate with each other directly using some standard format? I don't see the advantage of a relay server.

However, there is WalletConnect support being added to the wallet-adapter library now. This won't supersede Solana Pay for the use cases we are focusing on though.

Okay great! Will Solana wallets be adding support for wallet-connect anytime soon?

jordaaash commented 2 years ago

These are good questions but they aren't really in scope for this issue. I'm happy to discuss elsewhere though! I'm jordaaash#3040 on Discord, and there's a solana-pay channel there.

jordaaash commented 2 years ago

Closed by #77