monero-project / monero


[Discussion] Consider removing the tx_extra field #6668

Open tevador opened 4 years ago

tevador commented 4 years ago

First discussed here: https://github.com/monero-project/meta/issues/356

We should consider removing the tx_extra field from all non-coinbase transactions.

Main reasons:

  1. Enhanced fungibility due to a more uniform transaction format.
  2. Protection from the risks of arbitrary data on the blockchain, e.g. copyrighted material, privacy violations, politically sensitive or illegal content etc.

Required data that is currently stored in tx_extra (e.g. transaction public key) could be moved to a dedicated field.

Miner (coinbase) transactions could still allow the tx_extra field for the following reasons:

Disadvantages of removing the tx_extra field:

LocalMonero commented 1 year ago

Digital cash. I.e. a decentralized, fungible, electronic way of transferring value and storing it.

If incremental improvements present themselves we can fork.

spirobel commented 1 year ago

@UkoeHB

Transaction uniformity is not an all-or-nothing game, there are many incremental improvements that can and should be made as solutions present themselves.

How do you differentiate between sweeping things under the carpet and actually improving transaction uniformity?

The presented solutions here sound more like rug sweeping. @kayabaNerve says he does not mind using steganography to insert data into the Monero blockchain for his serai protocol.

The actual danger here is that these transactions get selected as decoys by other users and it damages their privacy. It would be much better if we went the opposite route and the serai transactions were clearly marked as serai transactions and the decoy selection algorithm of other Monero users could ignore those.

We don't know what kind of information an adversary could gather through observing these outside protocols and then seeing all these Monero transactions with steganographically hidden information inside of them as decoys.

@LocalMonero

In addition to the tx uniformity argument, one more argument in favor of removing the tx_extra field is that a dynamic (i.e. potentially infinite) block size that Monero has combined with an arbitrary field leads to Monero becoming a vector for uses that aren't money, meaning it's going to be less efficient at what its purpose is: being efficient next-gen money.

You obviously have a vested interest in not having an easy-to-use DEX, but even aside from that, this is just unrelated to tx_extra.

You can save arbitrary data on Monero without tx_extra if you get creative. This whole debate is just about aesthetics.

LocalMonero commented 1 year ago

Didn't think I'd encounter an ad hominem attack in a tx_extra field discussion 😅

Our interest is in Monero succeeding, with or without an easy-to-use DEX (but preferably with). In addition, there's nothing in the concept of an easy-to-use DEX that is necessarily dependent on a certain blockchain having an arbitrary data field.

You can save arbitrary data on Monero without tx_extra if you get creative.

There's a difference between constructing a tx a certain way to hide arbitrary data and having a field that invites arbitrary data. The conceptual discussion revolves around plugging those holes to the greatest possible extent.

spirobel commented 1 year ago

The conceptual discussion revolves around plugging those holes to the greatest possible extent.

Please read the comment above again. The question is whether we sweep things under the rug or actually help users select good decoys.

So some might think they are plugging holes, while it makes it harder in practice to distinguish between what is a good decoy and what is a bad decoy.

Of course it would be best and it would change the whole situation if we would not have to select decoys in the first place.

LocalMonero commented 1 year ago

Decoys are a separate question. The question here is whether a field for arbitrary data should be allowed on the Monero blockchain, with the arguments for and against.

spirobel commented 1 year ago

The question here is whether arbitrary data

Again. There is no way to prevent that.

Decoys are a separate question.

No they are related. That is actually the real issue here. What happens if your wallet selects transactions as decoys that look random and uniform, but in reality they are special transactions made by a separate protocol like serai?

Wouldn't it be better for the decoy selection algorithm of your wallet to know that these are not actually uniform and have steganographically inserted information in them?

wouldn't it be better if they were excluded from the decoy selection process, especially if there is additional information disclosed to the public by this separate protocol?

LocalMonero commented 1 year ago

Again. There is no way to prevent that.

Removing the arbitrary data field clearly does a huge service to this end. Any further holes can be plugged to the greatest possible extent so as to render arbitrary data storage unfeasible through impracticality, strengthening fungibility and improving time and space efficiency of Monero at the same time.

No they are related. That is actually the real issue here. What happens if your wallet selects transactions as decoys that look random and uniform, but in reality they are special transactions made by a separate protocol like serai?

If the Monero protocol is constructed strictly enough to ensure uniformity then they will be indistinguishable by definition.

wouldn't it be better if they were excluded from the decoy selection process, especially if there is additional information disclosed to the public by this separate protocol?

No, it would be better if there were no distinguishable outputs in the first place. What you're proposing (excluding decoys due to them being Serai-related) is a direct threat to fungibility.

kayabaNerve commented 1 year ago

1) I almost wish for this discussion to be whitelisted re: participants.

2) We can:

This discussion needs to be refocused on these two topics.

Personally, I believe it sounds like no one's actively calling to completely remove TX extra. Accordingly, I'd actually argue to refocus to:

I only personally insist on the last one. I think TX extra being prunable, and pruned nodes pruning it, is sane, as are statistical checks and an ASCII ban.

After these questions, all that remains is TX extra size discussions. I believe the currently open PR is a good first step, and would like to wait to further discuss size until we know how TX extra will exist as a concept.

LocalMonero commented 1 year ago

Personally, I believe it sounds like no one's actively calling to completely remove TX extra.

@kayabaNerve I'm pretty sure most people would agree that it needs to be removed, especially with Ordinals reigniting this discussion and making it clear what a liability arbitrary data can be. It is no coincidence that Ordinals exploding in popularity led to the biggest Bitcoin blocks ever mined; one can only speculate how it would affect Monero with its potentially infinite block size.

Nobody is denying that the field can be used for good, such as for a DEX like yours. However, one can have a DEX without also having the downsides of the arbitrary data storage field.

@UkoeHB states:

it's a field that's literally 'for anything we can't know in advance or are unable to pass judgement on'

This is not an asset. It's a liability.

kayabaNerve commented 1 year ago

@LocalMonero I will note your objection. If we remove TX extra, it will be effectively supporting steganography, increasing the amount of outputs, not allowing pruning, and globally increasing scan time. tevador, koe, jeffro, myself, and I believe sgp and hyc are for keeping it.

I recently talked with @jeffro256 on Monero Research Lounge. It sounds like we do mutually agree on making TX extra prunable. This would mean only including it in sig/TX hashes by hash, and having pruned nodes prune it. While we can further work on this (split full nodes into full/archive, with its own discussions there*), that'd be the first step.

Since TXs would only truly hold the hash, and nodes would be able to prune it, I don't feel a need to enforce any properties on the payload itself. We can still do a statistical check however.

*It should be possible to configure a node to hold all TX extras, and it should be possible for any new node to join and sync all historic TX extras so long as any node on the network still has a copy IMO. If those conditions are met, I wouldn't mind if full nodes only kept a week's worth, and only 'archive' nodes kept everything. The issue is how do two, not directly connected, archive nodes still sync over a net of (presumably primarily) full nodes?
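
A minimal sketch of the hash-only commitment described in this comment, assuming the transaction hash commits to a hash of TX extra rather than to its raw bytes. The function names and byte layout are hypothetical, and SHA3-256 stands in for Monero's Keccak-based hash purely for illustration; this is not Monero's actual serialization.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in 32-byte hash (assumption: SHA3-256 instead of Monero's Keccak)."""
    return hashlib.sha3_256(data).digest()


def tx_hash_full(prefix: bytes, extra: bytes) -> bytes:
    """Hash computed by a node that still stores the raw TX extra bytes."""
    return h(prefix + h(extra))


def tx_hash_pruned(prefix: bytes, extra_hash: bytes) -> bytes:
    """Hash computed by a pruned node that kept only the 32-byte hash of TX extra."""
    return h(prefix + extra_hash)


prefix = b"hypothetical-tx-prefix"
extra = b"hypothetical ~80 bytes of application data"

# Both kinds of node arrive at the same transaction hash,
# so dropping the raw payload does not invalidate the transaction.
full = tx_hash_full(prefix, extra)
pruned = tx_hash_pruned(prefix, h(extra))
print(full == pruned)
```

Because only the hash is committed, the payload itself never needs to be inspected to verify the transaction hash, which is why pruning it becomes possible under this kind of design.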

LocalMonero commented 1 year ago

If we remove TX extra, it will be effectively supporting steganography, increasing the amount of outputs, not allowing pruning, and globally increasing scan time.

@kayabaNerve Could you please describe exactly how you plan on encoding data into the blockchain in the absence of tx_extra?

tevador, koe, jeffro, myself, and I believe sgp and hyc are for keeping it.

I didn't see them stating that they are outright for keeping it, most of them have only directly stated to be against keeping it. I only saw them propose what additional policing needs to be done to the field if it is kept. Would @Tevador @UkoeHB @jeffro256 @SamsungGalaxyPlayer and @hyc please confirm that they are for keeping the tx_extra field?

spirobel commented 1 year ago

@kayabaNerve Could you please describe exactly how you plan on encoding data into the blockchain in the absence of tx_extra?

I would like to know too. Fake outputs already seem obvious. But how would we use CLSAGs? You mentioned earlier that there is also the possibility to save data in the CLSAGs. Are there any pitfalls with this?

kayabaNerve commented 1 year ago

@LocalMonero Fake outputs to fake keys. On the one hand, such TXs will appear like any other, other than having more than two outputs. On the other hand, they'll globally increase scan time, be un-prunable, and take up twice as much space as arbitrary data would have. That's the trade off.

@spirobel A reduction in sender privacy.

Also, I'm pretty sure my list of "for" is correct, yet if I did miscategorize anyone, I apologize.

EDIT: Upon misreading, tevador explicitly said to the contrary, later acknowledging there's a larger issue regardless. But that's definitely not for keeping it. Sorry, tevador. I do know koe is for keeping TX extra based on comments from IRC. The only other person I dragged in, explicitly, was jeffro, who I'll confirm with now

LocalMonero commented 1 year ago

@kayabaNerve could you please go into detail on how you would construct the tx and how the data would be encoded so that we can get a better idea of the time, space and cost requirements per byte of data that you wish to encode?

kayabaNerve commented 1 year ago

@LocalMonero It's been prior discussed. https://github.com/monero-project/monero/issues/6668#issuecomment-1195807365

Here's koe commenting, on this issue, against steganography: https://github.com/monero-project/monero/issues/6668#issuecomment-1195962883

I'd also note tevador, the party who frequently brings up limiting TXs to just 2 outputs (though I'm unsure if they were the original proposer of the idea), acknowledging it's not currently a good idea: https://github.com/monero-project/monero/issues/6668#issuecomment-1418051785

Without doing so, you cannot prevent steganography by outputs. Even when you do, there are several options available. Some damaging to Monero, some not.

Here's jeffro discussing a "good compromise", which involves keeping TX extra: https://github.com/monero-project/monero/issues/6668#issuecomment-1420920513 I've also privately confirmed they're for fixing, not removing.

LocalMonero commented 1 year ago

It's been prior discussed. https://github.com/monero-project/monero/issues/6668#issuecomment-1195807365

@kayabaNerve Thanks. Sounds like an additional arbitrary data hole that needs to be plugged. I agree that limiting to two outputs is a bad idea. I also, like @tevador, don't have ideas on how to prevent this. This doesn't mean, however, that there is no fundamental solution or that an arbitrary data field is desirable or that arbitrary data in the Monero blockchain is desirable.

As I see it, and correct me if I'm wrong, your argument is basically "Some people want to doodle on a dollar bill. You can't stop them from doodling on it, so let's dedicate some white space on the dollar bill for doodling".

My argument is that if we can prevent doodling we should to the greatest possible extent do that and make it a design goal, not cater to the doodlers.

kayabaNerve commented 1 year ago

@LocalMonero Monero has to decide what's best:

The discussion here has been making TX extra sane, limiting spam. @jeffro256 proposed a prunable solution, which I endorsed. There's an open PR to limit size already, as a relay rule. We've also discussed preventing malicious messages and ensuring message uniformity. This entire discussion here has been sane design goals to limit bs. Keeping TX extra, in some form, just ends up being the best way to limit bs.

If you have a third option, please let us know.

kayabaNerve commented 1 year ago
  • Do we want a statistical check/ASCII ban?
  • Should TX extra be hashed re: TX/sig hash?
  • Should pruned nodes keep TX extra?
  • Should full nodes keep TX extra?

Back to these, I don't see any reason TX extra shouldn't be hashed, allowing pruning, and why pruned nodes should keep it. I'd raise the further question: Do we want TX extra to be per output?

A continually raised point has been about per-output usage/per-output sizing. Moving from a TX extra to an output extra would signify that. I'd question exactly how this should be implemented, as it wouldn't be preferable if pruned TXs grew 32b per output compared to 32b per TX, but there are plenty of discussions possible down that route.

I'd also like to question if a statistical check is still desirable when the actual TX only has a hash the Monero node itself made, guaranteeing its uniformity. I personally would say the situation is largely unchanged, yet others may disagree.

LocalMonero commented 1 year ago

@kayabaNerve Monero should not cater to those who want to put arbitrary data on the chain. I acknowledge that to a certain extent this is impossible to fully prevent, but that doesn't mean that it should be catered to or made a design goal, which is akin to the Federal Reserve leaving some blank space on dollar bills for people to doodle on because they cannot prevent people from doodling on bills. It almost encourages people to.

The "steganography" hole for arbitrary data will exist in Monero with or without the tx_extra field. Your framing of this as "either we do tx_extra or steganography" is a false dichotomy from the blockchain's point of view. The pruning angle is also a dead end as @Gingeropolous explained https://github.com/monero-project/monero/issues/6668#issuecomment-1421730675

If you're going to "steg" your way into recording arbitrary data into the chain then so be it. Hopefully as the software continues to develop this will become harder and more costly so as to eventually become impractical.

kayabaNerve commented 1 year ago

If we do not compromise with them, we accept that they'll move to a solution that multiple people view as even worse for Monero. There's also the argument it's better for Monero as it's less distinguishable. I'll also note, as I've said prior, I wouldn't actually mind being moved to steganography. It's still perfectly fine for the amount of data I'm looking to place in TXs, is less distinguishable, and actually only ~2x as much as TX extra currently is.

Ginger's comment was in response to my own comment on jeffro's solution failing the data availability problem. I later commented here: https://github.com/monero-project/monero/issues/6668#issuecomment-1423429102 my support for a prunable solution and commented how data availability would be amenably handled under it.

LocalMonero commented 1 year ago

@kayabaNerve I don't see it as worse. Since it's basically the same as a tx and you're paying for the tx fee, it's indistinguishable from the tx, which helps fungibility. We can tweak the tx format to plug these holes even further in the future. Not removing but standardizing and expanding the tx_extra field will only invite further reliance upon it with all the associated baggage, not to mention the increased complexity of the protocol due to its presence leading to worse UX and DX (with all the rules that will be applied to it that are being discussed over in this thread, some of which will almost certainly change as issues with existing rules are discovered). Since you don't seem to mind either, I think the path forward is clear and the tx_extra field should be removed.

kayabaNerve commented 1 year ago

If you still believe it's better, I will stop arguing the point with you. You are welcome to have that belief, as it does still have merit.

It's also not clear. I'm fine with either, but I prefer an explicit extra. @jeffro256 is for fixing, not removing. koe is against steganography > extra. tevador is against keeping extra last time they commented one way or another. The other commentators I'd want to hear from, for their thoughts, are sgp and hyc.

*Any TX with extra does also decrease privacy. It just only does it for its two outputs, not for the n outputs that a TX with more than 2 outputs will. It's a trade off.

LocalMonero commented 1 year ago

@kayabaNerve

It's about half as efficient by space usage

That's good, Monero should not be space-efficient for arbitrary data.

It's not prunable

And if tx_extra is prunable and through its existence encourages people to use it for their applications then you are discouraging people from running full nodes in order to avoid hosting application data, which leads to centralization. So it's a trade off, and a bad one AFAIC.

It decreases privacy* by adding junk outputs as possible decoys (given no complete membership proofs)

This is an invalid argument as this capability exists with or without the tx_extra field.

It globally increases scan time

By what percentage would you calculate the global scan time would increase?

spirobel commented 1 year ago

By what percentage would you calculate the global scan time would increase?

We would have to know how many transactions will result from this, and that is something we don't know yet. I also wonder how this relates to view tags. Could the fake outputs interfere with view tags?

That's good, Monero should not be space-efficient for arbitrary data.

You should scroll up this thread, I think we went through this loop in the dialog tree already. It is also interesting to read that at the beginning of this thread the sentiments about this topic were a lot different and people seemed to have changed their viewpoints on this.

LocalMonero commented 1 year ago

One extra counterpoint about the scan time: any set of rules that will be devised over the kept tx_extra field will also add scan time to check the field against those rules. The complexity of those rules will determine the additional scan time, obviously. Not to mention the increased added global UX and DX complexity associated with tx_extra being a part of the protocol.

Fundamentally the question one must answer to determine their position is as follows: Is arbitrary data desirable on the Monero blockchain?

  a. If yes, then tx_extra is desirable.
  b. If no, then tx_extra should be removed.

dan-da commented 1 year ago

+1 for remove tx_extra.

I zoom out a bit, and think of monero as digital cash. Cash does not have any memo field. Yet finance still exists using cash. Contracts are often/usually payable with cash. Loans can be repaid with cash, etc. Cash is plenty good for pawnshops, loan sharks, etc. They simply keep their own ledger.

I think Bitcoin muddied the waters a bit with its "scripts" and "programmable money" that has the potential to turn each tx into a contract with its own rules. And then Eth went full throttle with the idea. Both at the expense of fungibility. To me though, it seems that cash/money is a separate concern from a contract. A good money is fungible with each unit the same as every other. A good contract will refer to sums of fungible money but need not specify individual notes.

In summary, I think that tx_extra is baggage from bitcoin that never should've been included in monero and should be removed. For monero to succeed it must become and remain the most fungible money. This is one step on that path.

vtnerd commented 1 year ago

@kayabaNerve my apologies for the lazy request for information - this thread is quite long - have you written down what you intend to put in tx_extra?

FWIW, the people expressing their opinions about how tx_extra needs to be removed for privacy purposes aren't necessarily considering how other projects interact with Monero. Is Monero claiming to be the do-everything chain? That stuff outside of pure fungible money is irrelevant and unnecessary (NFTs, etc)? That adaptor signatures are sufficient for all interactions? That centralized exchanges are sufficient? And mind you - I have a couple of ideas for NFTs which aren't too scummy and, in my humble opinion, are way more fascinating than the last round of "whatever" NFTs - but these ideas are years out, if they are ever going to be met.

Basically, @kayabaNerve isn't necessarily some jerk trying to hijack Monero, he represents a potential honest real-world user that could highlight where future MRL funding should go (i.e. the users requesting removal should possibly be campaigning for MRL funds for taproot+bulletproofs research to meet his demands instead of the other way around).

vtnerd commented 1 year ago

(i.e. the users requesting removal should possibly be campaigning for MRL funds for taproot+bulletproofs research to meet his demands instead of the other way around).

There's also the perspective that Monero should just have a fixed-size tx_extra which I think is the most fascinating argument, far more than removal.

vtnerd commented 1 year ago

Also @hyc @kayabaNerve @tevador is 256 bytes enough for a statistical test? It would filter out blatant uses of ASCII and Unicode, but possibly not much else ... or ...?

kayabaNerve commented 1 year ago

~80 bytes of structured data which will be publicly accessible to anyone in the know, yet I'll probably encrypt on chain (with a static, public key) just to be polite there.

I've made my arguments out of a legitimate belief TX extra is better than steganography. For it to be better, it must be more appealing. As a real world user, I can comment what's desired/appealing. That's how I've premised my discussions. None of it has been about what's necessarily best for me.

Active questions:

tevador commented 1 year ago

tevador is against keeping extra

As my last comment implied, I'm not strictly for removal, but I'm definitely against the current form.

If we keep tx_extra, it should be a uniformly sized field mandated by consensus (the length could be a function of the number of tx outputs, but definitely not arbitrary). 128 bytes is probably sufficient for most use cases, including a return address. It's approximately 4% of the size of a 2/2 Seraphis transaction. It could be prunable, so pruned nodes would only keep a 32-byte hash, saving 75% of space.
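
The figures in this paragraph can be checked with a few lines of arithmetic. This is only a sketch: the variable names are hypothetical, the 75% refers to the space taken by the field itself, and the transaction size is merely what the "approximately 4%" figure implies, not a number stated in the thread.

```python
FIELD_BYTES = 128  # proposed fixed tx_extra length
HASH_BYTES = 32    # what a pruned node would keep instead

# Share of the field's space saved when only the hash is stored.
pruned_saving = 1 - HASH_BYTES / FIELD_BYTES
print(f"{pruned_saving:.0%}")  # 75%

# If 128 bytes is about 4% of a 2/2 Seraphis transaction,
# the implied transaction size is roughly 128 * 100 / 4 bytes.
implied_tx_bytes = FIELD_BYTES * 100 // 4
print(implied_tx_bytes)  # 3200
```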

Additionally, I'm suggesting a relay rule that the tx_extra content must pass a quick statistical test that will fail for obviously unencrypted/plaintext messages.

One extra counterpoint about the scan time: any set of rules that will be devised over the kept tx_extra field will also add scan time to check the field against those rules. The complexity of those rules will determine the additional scan time, obviously. Not to mention the increased added global UX and DX complexity associated with tx_extra being a part of the protocol.

Wallets don't verify any consensus rules when scanning. If you meant IBD, the consensus rule would simply mandate the length of the field, which is a cheap rule to check.

LocalMonero commented 1 year ago

If we keep tx_extra, it should be a uniformly sized field mandated by consensus (the length could be a function of the number of tx outputs, but definitely not arbitrary). 128 bytes is probably sufficient for most use cases, including a return address. It's approximately 4% of the size of a 2/2 Seraphis transaction. It could be prunable, so pruned nodes would only keep a 32-byte hash, saving 75% of space.

@tevador Why maintain all this complexity and attack surface for the sake of enabling easy arbitrary data storage on the Monero blockchain? If arbitrary data is to be stored on the blockchain don't you think it's better if it's homogeneous with and indistinguishable from any and all other data on the blockchain by being virtually identical to other txs? Just because some people have uses for it that aren't malicious? So did legacy payment IDs. Devs have been advised against utilizing the tx_extra field for years at this point.

There were people who advocated for keeping legacy payment IDs due to backwards-compatibility and the lack of a need of a shared counter despite them being detrimental to privacy and bad for UX and DX. Despite all this it was removed because it's good for Monero. Now, we cater to arbitrary data injectors?

The framing of this question as "either steg or tx_extra" is a false dichotomy. Junk outputs can and will be exploited with or without the tx_extra field. It's not a "compromise" to keep the tx_extra field, it's a concession and a redefinition of Monero's design goals and principles.

Additionally, I'm suggesting a relay rule that the tx_extra content must pass a quick statistical test that will fail for obviously unencrypted/plaintext messages.

And now we have this can of worms where people are going to debate what sort of content should be allowed or not, and this side will be accusing that side of censorship while that side will be accusing this side of being unreasonable and whatnot. If the arbitrary data is steganographed into the txs this issue completely disappears.

tevador commented 1 year ago

@LocalMonero You are preaching to the choir.

There are good arguments both against and for keeping tx_extra. We can either keep repeating the arguments or try to find a solution that has a chance to get consensus and improve the current situation.

If arbitrary data is to be stored on the blockchain don't you think it's better if it's homogeneous with and indistinguishable from any and all other data on the blockchain by being virtually identical to other txs?

A mandatory 128-byte encrypted field in all transactions won't hurt tx uniformity.

It's not a "compromise" to keep the tx_extra field, it's a concession and a redefinition of Monero's design goals and principles.

The reality is that Monero has an unrestricted tx_extra field. Any compromise would be an improvement.

And now we have this can of worms where people are going to debate what sort of content should be allowed or not

That's nothing new. Monero already "censors" transactions for uniformity purposes (see this comment).

If the arbitrary data is steganographed into the txs this issue completely disappears

There is nothing that forces "steganographed" data to be encrypted. You can easily place ASCII text into output keys.

It seems that people will always try to sneak some extra data onto the chain (see https://github.com/monero-project/monero/issues/6668#issuecomment-1418008913), so keeping some limited field reserved for it might reduce the number of junk outputs that take up more resources to verify.

dan-da commented 1 year ago

A mandatory 128-byte encrypted field in all transactions won't hurt tx uniformity.

But it does add 128 (useless) bytes to the (vast majority?) of regular "payment" txs that otherwise would not include any tx_extra data at all, no?

My gut instinct is that less data (blockchain space) would be used overall by dropping tx_extra than by having a fixed-length 128-byte field. Show me wrong...?

Of course it may be difficult to reason about this because popularity/usage of future apps is unknowable. Still, it seems we could do everything possible to encourage such apps to store data offchain in their own ledgers. I understand that is not always easy... but then I come back to separation of concerns.

spirobel commented 1 year ago

As @j-berman pointed out in this comment, statistical tests for uniformity don't work.

I don't think statistical testing for uniformity will work. It sounds trivial to fool. Simply use an encoding scheme where you XOR plaintext with a static random pad, and then decode by doing the same.

Example plaintext payload:

00000000 10101010

Static random pad:

01101000 11101001

Encoded payload that will pass a uniformity check:

01101000 01000011
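
The encoding quoted above can be sketched in a few lines. The byte values are the ones from the example; `xor_with_pad` and `printable_ascii_fraction` are hypothetical helpers, and the printable-ASCII statistic is only a toy stand-in for whatever relay-level test might be proposed, not the test discussed in this thread.

```python
def xor_with_pad(data: bytes, pad: bytes) -> bytes:
    """XOR each byte with the static pad, repeating the pad if data is longer."""
    return bytes(b ^ pad[i % len(pad)] for i, b in enumerate(data))


def printable_ascii_fraction(data: bytes) -> float:
    """Toy statistic: share of bytes in the printable ASCII range 0x20..0x7E."""
    if not data:
        return 0.0
    return sum(0x20 <= b <= 0x7E for b in data) / len(data)


plaintext = bytes([0b00000000, 0b10101010])
pad = bytes([0b01101000, 0b11101001])

encoded = xor_with_pad(plaintext, pad)  # what would be placed on chain
decoded = xor_with_pad(encoded, pad)    # anyone who knows the pad recovers the payload

print(" ".join(f"{b:08b}" for b in encoded))  # 01101000 01000011
print(decoded == plaintext)                   # True

# An unencoded text message is trivially flagged by the toy statistic.
print(printable_ascii_fraction(b"hello monero"))  # 1.0
```

Since XOR with the same pad is its own inverse, the pad can be published once and reused, so the on-chain bytes can look random while remaining readable to anyone who knows the pad.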

@tevador

A mandatory 128-byte encrypted field in all transactions won't hurt tx uniformity.

The right place to do this would be in the wallets. The easiest way to achieve this goal is to increase the length of the paymentId to 128 bytes. The code to generate transactions already creates dummy paymentIds for every transaction to increase uniformity (every transaction looks like a transaction to an integrated address).

It seems that people will always try to sneak some extra data onto the chain (https://github.com/monero-project/monero/issues/6668#issuecomment-1418008913), so keeping some limited field reserved for it might reduce the number of junk outputs that take up more resources to verify.

But that only works if there is a reasonable expectation that this field won't be removed, altered or pruned in the future.

To avoid fatal mistakes, consensus and relay rules should only be added if they meet these requirements:

  1. they should have a clear goal
  2. they should achieve that goal
  3. they should be as simple as possible

If rules are added in the heat of the moment as a reaction to things without clearly considering this, people will trust the protocol and its continuity less and less.

Neither the tx_extra size limit on the relays nor the statistical uniformity checks pass these requirements.

It seemed like an ad hoc reaction to Ordinals that was well meant, but on further consideration it cannot achieve its objective.

tevador commented 1 year ago

@dan-da @spirobel All of this has already been discussed above. We are just cycling through the same arguments over and over.

The right place to do this would be in the wallets. The easiest way to achieve this goal is to increase the length of the paymentId to 128 bytes

Seraphis has no payment ID, so it must be a separate field.

vtnerd commented 1 year ago

Thanks @kayabaNerve I just read the entire discussion thread, as I should've from the beginning. My thinking is most similar to @tevador in the end. I will have some differing thoughts as this nears some kind of consensus, but I see my thoughts most reflected in @tevador .

~80 bytes of structured data which will be publicly accessible to anyone in the know, yet I'll probably encrypt on chain (with a static, public key) just to be polite there.

You deflected here, please provide what is actually being stored. You have stated that you are willing to record information on the Monero blockchain that links certain transactions to your DEX system, and you are willing to use the ZKP system to achieve your goals. In either case (tx_extra or "steganography"), chain analysis companies are able to identify these transactions and link them to this other chain, which (by your claims) isn't providing any meaningful privacy either.

The ideal (based on Monero community ethos) would be a system where the transaction wasn't guaranteed to be 100% linkable to your DEX. And I'm with @tevador in that I'm skeptical that the same thing cannot be achieved with only hashes.

spirobel commented 1 year ago

@tevador

Seraphis has no payment ID, so it must be a separate field.

I was speaking about the current situation. The relay rule for the tx_extra size limit that you want to introduce was also for the current protocol, right?

@vtnerd

There's also the perspective that Monero should just have a fixed-size tx_extra which I think is the most fascinating argument, far more than removal.

I have a valid usecase for this kind of encrypted data field that even enhances user privacy instead of diminishing it:

I wrote a browser wallet for Monero that interacts with a website via calls to a standardized REST API endpoint at that website. Please take a look at this demo video to see how it works (a transaction on the Monero stagenet is made in the video): https://youtu.be/4DLcsQ45zoE?t=132

This is how the wallet code and the demo backend (localhost:3006 in the demo video) that I wrote works:

  1. the user wants to buy a product (like access to a chat group or a private RSS feed)
  2. the browser wallet sends a tx proof to the website (at a standardized REST endpoint) directly after making the transaction. The backend therefore does not even need access to the merchant's view keys, which makes non-custodial marketplace websites possible.
  3. Afterwards the wallet can automatically relogin into the website via the spendproof of the txid. This means no emails, usernames or passwords are necessary anymore. Authentication is possible with just the txid and there is no need to ever copy addresses.
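The three steps above can be sketched from the backend's side. This is a minimal sketch, not the code from the demo: the `Backend` class, its method names, and the challenge strings are hypothetical, and the two proof verifiers are injected as plain callables because the actual verification (for example via a wallet RPC) is outside the scope of the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical verifier signature: (txid, message, signature) -> bool.
# A real backend would delegate this to Monero wallet software.
ProofVerifier = Callable[[str, str, str], bool]


@dataclass
class Backend:
    verify_tx_proof: ProofVerifier      # step 2: proves the payment was made
    verify_spend_proof: ProofVerifier   # step 3: proves control of the spending wallet
    purchases: Dict[str, str] = field(default_factory=dict)  # txid -> product

    def submit_purchase(self, txid: str, product: str,
                        challenge: str, signature: str) -> bool:
        """Step 2: the wallet posts a tx proof right after paying."""
        if not self.verify_tx_proof(txid, challenge, signature):
            return False
        self.purchases[txid] = product
        return True

    def login(self, txid: str, challenge: str, signature: str) -> Optional[str]:
        """Step 3: log back in with a spend proof for a known txid."""
        if txid not in self.purchases:
            return None
        if not self.verify_spend_proof(txid, challenge, signature):
            return None
        return self.purchases[txid]
```

The point of the design is that the backend stores only txids and never needs an email address, a password, or the merchant's view key.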

It would be great to recover all the information just from the wallet seed, so I would like to save the URL of the website in whose context the transaction was made. This way, if the user recovers from seed, he won't risk disclosing himself with the wrong txid at the wrong website. (All the login information would also sync neatly between devices.)

The Zcash encrypted memo field is 512 bytes, by the way: https://zcash.readthedocs.io/en/latest/rtd_pages/memos.html. If it is not used, they pad it with zeros (before encryption).
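A fixed-size memo of this kind can be sketched as follows. This is only an illustration of zero-padding to a constant length before encryption, as described above; the function names are hypothetical, and any further conventions of the real Zcash memo encoding are not modeled here.

```python
MEMO_SIZE = 512  # field size in bytes, taken from the Zcash figure quoted above


def pad_memo(memo: bytes, size: int = MEMO_SIZE) -> bytes:
    """Right-pad the memo with zero bytes so every memo has the same length."""
    if len(memo) > size:
        raise ValueError(f"memo is {len(memo)} bytes, limit is {size}")
    return memo + b"\x00" * (size - len(memo))


def unpad_memo(padded: bytes) -> bytes:
    """Strip the zero padding. Note that this also strips any zero bytes
    that were genuinely at the end of the original memo."""
    return padded.rstrip(b"\x00")


# Example: storing the website URL from the use case above.
padded = pad_memo(b"https://example.com")
assert len(padded) == MEMO_SIZE
```

The padded buffer is what would then be encrypted, so that used and unused memos are the same size on chain.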

tevador commented 1 year ago

I was speaking about the current situation. The relay rule for the tx_extra size limit that you want to introduce was also for the current protocol, right?

Tx_extra currently cannot be easily removed because it contains data required to recognize payments (DH public keys and payment IDs). Seraphis obsoletes payment IDs and moves the DH keys elsewhere, which could facilitate the removal of tx_extra. The tx_extra size limit proposal is a stopgap solution before Seraphis.

spirobel commented 1 year ago

Seraphis obsoletes payment IDs and moves the DH keys elsewhere, which could facilitate the removal of tx_extra.

Have you read my post above and reviewed the use case that I presented? Do you think it is valid?

kayabaNerve commented 1 year ago

@vtnerd Sorry, I didn't mean to deflect. I thought that was the relevant information and didn't want to bloat the discussion with Serai-domain-specific commentary.

Specifically, the "instruction". The instruction specifies what to do with the received coins. For an output of 5 XMR into Serai, that may be a 32 byte Serai address to receive sriXMR, or it may be an instruction to swap to BTC and send to a BTC address.

Serai is actually out-of-scope for any privacy discussion because it's publishing its view keys and all TXs out are in plaintext on the Serai chain. My personal policy is its TXs should appear like any other, minus the additional data we throw in (either via TX extra or additional outputs), but the fact we publish view keys means it'll never actually be private.

This has implications as effective decoy poisoning, which does justify discussions on complete membership proofs, yet those are out of scope for this conversation and I'd rather not bring it up again.

The ideal (based on Monero community ethos) would be a system where the transaction wasn't guaranteed to be 100% linkable to your DEX.

I explicitly disagree here. I believe the point of decentralized technology is to be trustless. In order to be trustless, it must be verifiable. When 100 sriXMR come into existence, the only way to verify that is to see 100 XMR on chain. That requires revealing the Monero output.

... theoretically ideally? Sure. We'd be able to verifiably say we received 100 XMR without saying in which TX. Practically ideally? It's impossible.

And I'm with @tevador in that I'm skeptical that the same thing cannot be achieved with only hashes.

The issue with hashes isn't any commentary on correctness/verifiability. It's about mandating I build an entire data pipeline, and spam preventative measures, to handle it. If Monero only gives me a hash, users cannot submit the data on-chain due to on-chain TXs requiring fees and these users presumably not having any SRI. Just XMR they sent to Serai to swap.

So they could submit to the validators who use a zero-fee transaction! I now have to make the validators directly publicly accessible, and directly handle connections from every Monero user to receive and validate their data. Then I need to duplicate that data to ensure it's verifiable over the entire amount of time it's relevant (the shorter of Monero/Serai's existence).

That adds network architecture issues on my end and DoS concerns. While it'd be an improvement to the system, sure, if I could spare the bandwidth to build that entire pipeline, I can't justify it right now when I can trivially put the data on Monero instead. I am available to be attacked for taking the easy way out. I am also available to tell you most developers wanting to use TX extra will also take the easy way out. I'd also note this is only the easy way out because Serai's design has a secondary synchronized database which I'm simply not choosing to extend/use. For most designs, it's not the "easy way out". The entire point is using X for data storage (see Bitcoin ordinals not simply referring to IPFS URLs).

I'd also note this is again going in circles, which I share great frustration in. I responded here to ensure vtnerd, who has my utmost respect, is caught up. The only other comment I'd chime in on is @j-berman's about using a static pad to 'cheat' the uniformity check. That 'cheats' the uniformity check by encrypting your data to a uniform string of data only actually accessible to those with the decryption key. Congrats. That's the point :p
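The static-pad point can be illustrated with a small sketch. Everything here is an assumption made for illustration: the thread does not specify the uniformity check, so `looks_uniform` is a naive stand-in (it only rejects data where one byte value dominates), and the pad is derived from a hypothetical published key with SHAKE-256.

```python
import hashlib
from collections import Counter


def keystream(key: bytes, n: int) -> bytes:
    """Derive n pad bytes from a static key (SHAKE-256 as an example XOF)."""
    return hashlib.shake_256(key).digest(n)


def mask(data: bytes, key: bytes) -> bytes:
    """XOR data with the static pad. Applying it twice restores the data.
    With a published key this hides nothing from anyone who knows the key."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def looks_uniform(data: bytes, max_share: float = 0.125) -> bool:
    """Naive stand-in check: no single byte value may exceed max_share of the data."""
    if not data:
        return True
    most_common_count = Counter(data).most_common(1)[0][1]
    return most_common_count <= max(1, int(len(data) * max_share))


STATIC_KEY = b"hypothetical published key"
structured = bytes(80)               # 80 highly structured bytes (all zeros)
masked = mask(structured, STATIC_KEY)

print(looks_uniform(structured))     # the raw data fails this naive check
print(looks_uniform(masked))         # the masked data is expected to pass
print(mask(masked, STATIC_KEY) == structured)
```

A check of this kind can only tell whether bytes look random, not whether they carry meaning to someone holding the key.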

Gonbatfire commented 1 year ago

"either steg or tx_extra" is a false dichotomy. Junk outputs can and will be exploited with or without the tx_extra field.

@LocalMonero Yes, but without tx_extra, users will be incentivized to generate even more junk outputs.

You can't prevent people from storing arbitrary data on Monero, so at least we should incentivize them to do it in the least harmful way.

LocalMonero commented 1 year ago

@Gonbatfire quite the opposite, you are disincentivizing it by making it more costly. In addition, junk outputs are an exploit that also need to be fixed, but this is beyond the scope of this issue.

Fundamentally, Monero is a project with one mission: being digital cash. UNIX philosophy applies: do one thing and be good at it. Branching out into arbitrary data storage is not only beyond the scope of Monero but also hurts the efficiency of its purpose.

If you want to have some sort of a Monero/Ethereum hybrid you're welcome to do what @fluffypony did with Tari. DEXs are possible with or without tx_extra: in addition to Haveno existing, even @kayabaNerve admits that it's not a requirement for his DEX but merely a preference.

Gonbatfire commented 1 year ago

disincentivizing it by making it more costly

@LocalMonero It will be more costly, but it will also make data storage (which is going to happen anyway) more harmful to users' privacy. It's not worth it.

I agree Monero's focus is private digital cash, but I believe that by removing tx_extra we are harming user privacy, since it increases the incentive to create MORE garbage outputs. Garbage outputs are a problem right now; until we find a way to fix that, let's not make it worse.

I'm not a fan of storing jpegs on scarce blockspace that has the potential to be the foundation for humanity, but we either learn to live with it or we start playing an unwinnable Whac-A-Mole game that will only result in negative side-effects for users.

I'm afraid this may become like the war on drugs: the war does more damage than the drugs themselves.

LocalMonero commented 1 year ago

@Gonbatfire if it's more costly to store arbitrary data on the chain then less of it will be stored. If junk outputs are a threat to user privacy then, just like any other threat to user privacy (like legacy/integrated payment IDs vs subaddresses), they need to be patched as soon as possible, even if that means breaking backwards compatibility. The "junk outputs harm privacy" argument is something that everyone can agree on, and I hope everyone can also agree that it needs to be patched.

The argument "given that we have a privacy exploit that, as a side effect, enables arbitrary data storage we should make arbitrary data storage easier" makes no sense to me.

The argument "while we have this privacy exploit that, as a side effect, enables arbitrary data storage let's enable another privacy attack vector that specializes on easy arbitrary data storage, otherwise a well-intentioned dev may exploit user privacy" makes no sense to me either.

Well-intentioned devs developed applications around payment IDs since they don't require a shared counter unlike subaddresses, and I think we all agreed that this hole needs to be plugged. Even LocalMonero had to switch (subaddresses didn't exist back when LocalMonero started), and it was a costly switch, but we did it because it's best for user privacy, despite the fact that it was easier for us to just keep things as is.

As a side note, perhaps well-intentioned devs shouldn't be exploiting user privacy, especially given that it is not a necessity? So, if well-intentioned devs shouldn't be exploiting privacy holes, who does that leave us with but malicious actors and those who are ignorant (and can they really be that ignorant if they understand such advanced exploits)?

but we either learn to live with it or we start playing an unwinnable Whac-A-Mole game that will only result in negative side-effects for users.

As @UkoeHB said, it's not an all-or-nothing game. It's incremental. Plug this hole, plug that hole, keep moving in the right direction. And keeping tx_extra post-Seraphis is a step in the wrong direction.

kayabaNerve commented 1 year ago

even @kayabaNerve admits that it's not a requirement for his DEX but merely a preference.

Without TX extra, I'm embedding data in outputs. It's a trivial solution solving the data availability problem without a month or two of my own architecting. While you can comment it's not a requirement, and we can discuss the theory of that, I've been clear my personal advocacy is for a practical solution which is optimal to all parties.

No matter what, I have a solution. Output encoding is only minimally more invasive on my end. To Monero, it's globally increasing scan time, poisoning decoys, and using about twice as much space. The only benefit is some argument about uniformity which doesn't legitimately stand when most TXs are 2-out.
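As a rough illustration of the output-encoding cost, the number of extra outputs can be estimated from the payload size. This is a sketch under a stated assumption: 32 payload bytes per output (one output key's worth) is an illustrative figure, not something specified in this thread, and the function name is hypothetical.

```python
import math


def junk_outputs_needed(payload_len: int, bytes_per_output: int = 32) -> int:
    """Number of extra outputs needed to carry payload_len bytes,
    assuming each output can hide bytes_per_output bytes (an assumption)."""
    if payload_len < 0 or bytes_per_output <= 0:
        raise ValueError("payload_len must be >= 0 and bytes_per_output > 0")
    return math.ceil(payload_len / bytes_per_output)


# The ~80 bytes of structured data mentioned earlier in the thread:
print(junk_outputs_needed(80))  # 3
```

Each of those outputs occupies more chain space than the payload it carries, has to be scanned by every wallet, and can be picked as a decoy, which is the cost described above. The sketch does not attempt to verify the "about twice as much space" figure.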

Sure, that arguably makes me malicious, as I try to comment on the practicality of this. I'm also here arguing to never create that future. I'd also, personally, note that if I was forced between damaging steganography (a question of scale) and not having a widely used DEX, I'd question which is more damaging and pick whichever is better. That may mean withdrawing from Monero, or it may mean reducing Monero's effective privacy (which could be solved with other methods). Either way, as long as I do my best to make Monero better overall, I don't believe I can be argued as malicious. Just misinformed/misevaluating at worst.

I'd also hope to be able to commit development resources to avoid this problem entirely in the future, yet I can't right now and won't make assumptions on that premise. I also don't assume the next person who walks in can nor that they have a design which even offers a path reachable with more resources. There are some designs which will fundamentally want to use Monero for its data. While I don't support anyone adding KBs, I don't personally mind bytes. Finally, I'll reiterate I haven't premised any of my arguments on my project directly. Just with the perspective I have thanks to my project.

The argument "given that we have a privacy exploit that, as a side effect, enables arbitrary data storage we should make arbitrary data storage easier" makes no sense to me.

Monero will always have arbitrary data storage as it's a data protocol. While we attach the intent of currency to that data, that doesn't change it's a data protocol. The fact Monero communicates arbitrary data is a natural side effect which can never be stopped. The only question is where we go from there.

LocalMonero commented 1 year ago

Without TX extra, I'm embedding data in outputs.

For a certain period of time you are. And when that exploit's patched you'll have to find a new one that's probably even less cost-efficient, which, I assume, you will also report to the community (or, hopefully, you'll figure out a design that doesn't require you to store anything on the chain). It's a process.

I've been clear my personal advocacy is for a practical solution which is optimal to all parties.

Monero should not offer practical solutions for arbitrary data injectors. There are plenty of other projects that do.

No matter what, I have a solution. Output encoding is only minimally more invasive on my end. To Monero, it's globally increasing scan time, poisoning decoys, and using about twice as much space.

For how many transactions? 0.01% out of the whole network? If someone actually wants to harm Monero's privacy and scan time and to poison decoys, they will attempt to make these txs a large portion of the overall network (which will certainly trigger an urgent response from the dev community), so your usage of it doesn't seem to be that big of a problem prior to this being patched. I'm much more concerned with actual malicious actors exploiting this.

The only benefit is some argument about uniformity which doesn't legitimately stand when most TXs are 2-out.

And tx_extra with arbitrary data doesn't hurt uniformity even more? And if it's encrypted or otherwise uniform in nature and every transaction will have to have it be fully filled, that doesn't hurt space requirements? And that doesn't make transaction costs higher for all users globally, regardless of whether they are utilizing tx_extra or not, making not utilizing tx_extra be unproductive for the fee they're paying?

Sure, that arguably makes me malicious

I didn't say you were malicious. I was talking about the fact that the hostile exploiters of the junk outputs are the actual problem, as well-intentioned devs will strive to avoid harming user privacy. This is why the junk outputs are beyond the scope of this discussion and are a separate issue.

I'm also here arguing to never create that future.

"Hey man, listen, I got these boxes of my stuff that I want to store in your bank. I know your bank doesn't allow random people to store boxes of stuff here but listen man if you don't let me in through the door I'll break through your window but I don't want to create that future so how about you designate a key card for me, alright? It's for your own good."

Sounds like the solution is putting bars on the window.

I'd also, personally, note that if I was forced between damaging steganography (a question of scale) and not having a widely used DEX, I'd question which is more damaging and pick whichever is better.

Thank goodness that in reality having a widely-used DEX is not predicated on that 😅

I'd also hope to be able to commit development resources to avoid this problem entirely in the future, yet I can't right now and won't make assumptions on that premise. I also don't assume the next person who walks in can nor that they have a design which even offers a path reachable with more resources. There are some designs which will fundamentally want to use Monero for its data. While I don't support anyone adding KBs, I don't personally mind bytes. Finally, I'll reiterate I haven't premised any of my arguments on my project directly. Just with the perspective I have thanks to my project.

If your or the next person's design necessarily requires arbitrary data injection into the Monero chain to work they are welcome to use another project or deal with the fact that even if they find a way to inject data it's going to be as inefficient as possible.

Monero will always have arbitrary data storage as it's a data protocol. While we attach the intent of currency to that data, that doesn't change it's a data protocol. The fact Monero communicates arbitrary data is a natural side effect which can never be stopped. The only question is where we go from there.

It's true that someone can take the "Date of Birth" field on some website and use it to encode a certain amount of bytes, and perhaps, by encoding this way over a long enough chain of accounts, encode the entire Bee Movie into that website's database. Does this mean that this website should cater to anyone who wants to store movies on it?

Data constraints work. Yes, they don't (and maybe can't) work 100%, but defining the transaction structure as strictly as possible will make arbitrary data storage as costly and inefficient as possible, disincentivizing arbitrary data storage to the greatest possible extent, which in turn keeps the Monero chain as free of arbitrary data and as optimized for transactions as possible. In addition, this will also ensure maximum uniformity and by extension privacy and fungibility. The opposite is true for making arbitrary data storage easier: Monero will be worse at what it should be best at.

I remember similar arguments being made in favor of giving in to ASICs, which were "inevitable" until @SChernykh and @hyc brought us RandomX. Monero stuck to its core and didn't give in, and thank goodness for that.

TheCodeingPadawan commented 1 year ago

TL;DR: In summary, a developer utilising Monero payments should implement accounting and metadata on their own systems to track what they need to track. Monero should always just be the private and fungible medium of exchange.

Wall of Text version: Arbitrary data should be stored on the side. Any logic or arbitrary data should always be separate from the cash itself. You can consume and dispense Monero as required within a private platform, but those inputs and outputs should always just remain as fungible Monero.

The only way to fix this would be by adding fixed arbitrary data bytes to each transaction, and force encryption to keep everything on the public chain indistinguishably fungible and private between the two transacting parties. But this will consume valuable space on every transaction, when practically all transactions do not use the TX extra field. Those that do would also likely consume little of the dedicated space, as it would need buffer room to give it more utility.

It just seems like wasting blockchain space for the fraction of a fraction of a percent that have a use case, for negative to little benefit to the overall project.
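The space argument can be put into numbers with a small calculation. All figures in the usage example are hypothetical placeholders (field size, transaction count, and usage rate are not measurements from the Monero network), and the function name is made up for illustration.

```python
from typing import NamedTuple


class Overhead(NamedTuple):
    total_bytes: int     # space taken by the fixed field across all txs
    used_bytes: int      # bytes that actually carry data
    padding_bytes: int   # bytes that are pure padding


def fixed_field_overhead(n_tx: int, field_size: int,
                         n_tx_using: int, avg_used_bytes: int) -> Overhead:
    """Chain space consumed by a mandatory fixed-size field in every transaction."""
    if not 0 <= n_tx_using <= n_tx:
        raise ValueError("n_tx_using must be between 0 and n_tx")
    if not 0 <= avg_used_bytes <= field_size:
        raise ValueError("avg_used_bytes must be between 0 and field_size")
    total = n_tx * field_size
    used = n_tx_using * avg_used_bytes
    return Overhead(total, used, total - used)


# Hypothetical: 1,000,000 txs, a 512-byte field, 1,000 txs using 80 bytes each.
result = fixed_field_overhead(1_000_000, 512, 1_000, 80)
print(result.total_bytes)    # 512000000
print(result.used_bytes)     # 80000
print(result.padding_bytes)  # 511920000
```

Under these placeholder numbers, almost all of the added space is padding paid for by users who never use the field.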

kayabaNerve commented 1 year ago

For now you are

I'm not.

And tx_extra with arbitrary data doesn't hurt uniformity even more?

It explicitly depends on the over-arching scheme. I wouldn't immediately say yes.

And if it's encrypted or otherwise uniform in nature and every transaction will have to have it be fully filled, that doesn't hurt space requirements? And that doesn't make transaction costs higher for all users globally, regardless of whether they are utilizing tx_extra or not, making not utilizing tx_extra be unproductive for the fee they're paying?

Encryption would not hurt space requirements. All TXs having a default sized blob is a distinct discussion I'm personally not in favor of.

TL;DR:

This fails to be a proper TL;DR of the many arguments at stake. I do not feel a need to continue running down these nits. I'll withdraw from this discussion until there's a legitimately new argument, or a conversation of sufficient weight begins. Right now, this is just back and forth bickering which is productive to none of us.

LocalMonero commented 1 year ago

I'm not.

The "now" I meant was in the post-tx_extra-removal time, I was answering in the context of "without the tx_extra I will", as I quoted.

Encryption would not hurt space requirements. All TXs having a default sized blob is a distinct discussion I'm personally not in favor of.

If the tx_extra field isn't uniform then it's a massive privacy/fungibility issue. If it is then it would have to be fixed length and encrypted, permanently increasing space requirements for txs globally, forcing people to store data and pay fees whether they're utilizing the field or not.