ArweaveTeam / arweave-standards

Arweave standards, specifications and best practices

Data Purging Proposal #8

Open jonathanstanley opened 3 years ago

jonathanstanley commented 3 years ago

Abstract

This document describes a method to purge data from the Arweave network while retaining information permanence and censorship resistance.

Motivation

Exclusive power to delete / remove / purge / destroy is an essential property of ownership (i.e., jus abutendi and jus disponendi). That is to say: if you can't destroy it, you don't own it. In its current form, Arweave is limited to the public domain. Contributing users curate which data is kept permanently, which is certainly a great benefit. However, this can be improved with revocability to enable ownership-type behaviors. Since most data evolves over time, letting users act as owners of their data, rather than as curators of it, would encourage more data to be kept with Arweave.

Example Scenario

If Bob wants to store a secret forever, Bob encrypts the secret and hands that encrypted data to Arweave for permanent storage. Later, Bob gets worried his key was exposed. Bob decides to re-encrypt the secret with a new key and once again hand the encrypted data to Arweave. However, without the ability to delete, Bob's vulnerable secret remains readily available on Arweave forever.

This proposal would allow Bob to REVOKE the vulnerable data. Of course, once the data is distributed, there is no guarantee it can ever be eliminated. For example, Alice may choose to run a node that never deletes data. However, with appropriate disincentives through a REVOKE, Bob's vulnerable data can be effectively diminished/dropped from the Arweave network at Bob's sole discretion.

Specification

Transactions are extended to carry an optional REVOCABLE flag.

Revoked transactions:

  1. cease earning mining rewards for storage
  2. must not be transmitted on the Arweave network (may be enforced through audit)
  3. may carry additional incentives/disincentives

| Revocable | Summary | Description |
| --- | --- | --- |
| true | transaction is revocable | The data remains forever, unless and until revoked. Revoked transactions cease earning any reward for mining on that data and are effectively purged from the Arweave network. |
| false | transaction is irrevocable | The data is kept forever and retains strong censorship resistance. |
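
The rules above can be sketched as a small data model. This is only an illustration under stated assumptions, not Arweave's transaction format: the `Transaction` class, its field names, and the helpers `revoke`, `earns_storage_reward`, and `serve` are all hypothetical, and the signer check is reduced to a string comparison where a real node would verify a signature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    """Hypothetical transaction record; field names are illustrative only."""
    tx_id: str
    owner: str
    data: Optional[bytes]
    revocable: bool = False  # the optional REVOCABLE flag from the proposal
    revoked: bool = False


def revoke(tx: Transaction, signer: str) -> None:
    """Mark a transaction as revoked if the flag and the signer allow it."""
    if not tx.revocable:
        raise PermissionError("transaction is irrevocable")
    if signer != tx.owner:
        # Assumption: only the owner may revoke. A real node would verify
        # a signature here instead of comparing strings.
        raise PermissionError("signer may not revoke this transaction")
    tx.revoked = True
    tx.data = None  # a compliant node drops the payload


def earns_storage_reward(tx: Transaction) -> bool:
    """Rule 1: revoked transactions cease earning storage rewards."""
    return not tx.revoked


def serve(tx: Transaction) -> Optional[bytes]:
    """Rule 2: revoked data must not be transmitted; answer with None."""
    return None if tx.revoked else tx.data
```

A transaction created without the flag behaves exactly as it does today: `revoke` refuses it and `serve` keeps returning the data.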

Censorship Resistance

Without proper care, the ability to REVOKE would also enable an adversary to compel the removal of data. However, this is avoided in two ways:

  1. If the original transaction does not carry the REVOCABLE flag, the data will remain permanently (as currently implemented)
  2. The original transaction could specify the address which can REVOKE. If a transaction has an independent REVOCABLE address, an adversary cannot prove who possesses the private key to sign a REVOKE transaction. Of course, if this is insufficient protection, the user should choose to send irrevocable transactions instead.
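
A minimal sketch of the second point, assuming the original transaction stores a dedicated revoke address and that a REVOKE transaction presents a public key plus a signature. The names `address_of`, `may_revoke`, and the `verify` callback are hypothetical. Hashing the public key to get an address is loosely modeled on how Arweave derives wallet addresses, but treat the exact derivation here as an assumption. Real signature verification needs a cryptography library, so it is passed in as a callable.

```python
import base64
import hashlib
from typing import Callable, Optional


def address_of(public_key: bytes) -> str:
    """Derive an address as the base64url-encoded SHA-256 of the public key."""
    digest = hashlib.sha256(public_key).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()


def may_revoke(
    revoke_address: Optional[str],
    public_key: bytes,
    message: bytes,
    signature: bytes,
    verify: Callable[[bytes, bytes, bytes], bool],
) -> bool:
    """Accept a REVOKE only from the address named in the original transaction."""
    if revoke_address is None:
        # No revoke address recorded: the transaction is irrevocable.
        return False
    if address_of(public_key) != revoke_address:
        return False
    # `verify` stands in for a real signature check supplied by the caller.
    return verify(public_key, message, signature)
```

Because the revoke address can belong to a key used nowhere else, the transaction itself does not reveal who is able to sign the REVOKE.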

Non-Compliance

Non-compliant nodes may retain REVOKED data. This risk is unavoidable. However, it can be mitigated:

  1. non-compliant nodes incur the expense of storing the data with no predictable reward.
  2. it is possible for nodes to audit each other to ensure any request for REVOKED data receives a NULL response. Non-compliant nodes found to transmit REVOKED data may be penalized or dropped.
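
The audit in point 2 could look roughly like the sketch below. `audit_peer` and the `fetch` callback are hypothetical names. `fetch` stands in for whatever request a node would make to a peer and is assumed to return `None` for a NULL response.

```python
from typing import Callable, Iterable, List, Optional


def audit_peer(
    fetch: Callable[[str], Optional[bytes]],
    revoked_ids: Iterable[str],
) -> List[str]:
    """Return the revoked transaction IDs for which the peer still served data."""
    return [tx_id for tx_id in revoked_ids if fetch(tx_id) is not None]
```

An empty result means the peer passed this audit. Note that this only detects transmission of REVOKED data; it cannot show whether a peer still stores it.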

elliotsayes commented 2 years ago

An extension of this could be a new transaction type to reallocate a previous store transaction's tokens to new data. Your 'revocation' proposal would fit into this 'update' proposal as a special case of updating to null data.

The advantage of this extension is that it allows making small (and possibly frequent) mutations to large chunks of data while being reimbursed for old and useless copies of the data. If the size of the new file is greater, then the update transaction would commit additional tokens equal to the excess.
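
As a rough sketch of that arithmetic, assuming a flat price per byte (real Arweave pricing is not flat, so this is a simplification) and using the hypothetical name `additional_tokens`:

```python
def additional_tokens(old_size: int, new_size: int, price_per_byte: int) -> int:
    """Tokens an update must add when the new data is larger than the old.

    Sizes are in bytes and the price is in the smallest token unit, so the
    calculation stays in integers. If the new data is not larger, nothing
    extra is committed.
    """
    if old_size < 0 or new_size < 0 or price_per_byte < 0:
        raise ValueError("sizes and price must be non-negative")
    excess_bytes = max(0, new_size - old_size)
    return excess_bytes * price_per_byte
```

Under this model a revocation is the case `new_size == 0`, which never requires additional tokens.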

This proposal would make Arweave more affordable and practical in the many use cases where only the latest (or latest n) versions of the data are required. Essentially, Arweave would upgrade from being only a permanent, immutable archive to also offering permanently available, mutable storage, which is much closer to Arweave's "permanent dropbox" marketing claim.

I am also a big fan of your idea for a flag to indicate whether future revocations/updates are permitted (in my opinion a separate flag for each is superfluous).

solarsailorneo commented 2 years ago

This seems to keep Arweave's archival capabilities. I could imagine websites including a ubiquitous tag that specifies whether the information presented is immutable or not.

RdeWilde commented 2 years ago

This is actually the main feature that is keeping us from using Arweave. I do like the promise that it is stored permanently, but I don't need forever. If permanent were defined as an agreed (minimum) period, that would make sense.

If I have data that I want to keep for 5 years minimum, that could just become garbage afterwards. I'd like to have the option to pay for 5 years, with the option to extend the TTL, but not "forever". If not extended, the data could be purged as it is no longer being paid for. There are very few use cases where "forever" makes sense.

joshbenaron commented 2 years ago

I guess in theory you could add a restriction to the proof-of-access algorithm so that it only allows proofs whose challenge bytes come from a random non-revoked transaction.
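
A toy sketch of that restriction, not Arweave's actual proof-of-access algorithm: the challenge byte is chosen deterministically from a seed, but only among bytes that belong to non-revoked transactions. `pick_challenge` and the `(tx_id, size, revoked)` tuple layout are hypothetical.

```python
import hashlib
from typing import Iterable, Tuple


def pick_challenge(
    seed: bytes,
    txs: Iterable[Tuple[str, int, bool]],
) -> Tuple[str, int]:
    """Pick (tx_id, byte offset within that tx), skipping revoked transactions.

    `txs` yields (tx_id, size_in_bytes, revoked) tuples.
    """
    live = [(tx_id, size) for tx_id, size, revoked in txs if not revoked and size > 0]
    total = sum(size for _, size in live)
    if total == 0:
        raise ValueError("no non-revoked data to challenge")
    # Map the seed onto an offset within the concatenated non-revoked bytes.
    offset = int.from_bytes(hashlib.sha256(seed).digest(), "big") % total
    for tx_id, size in live:
        if offset < size:
            return tx_id, offset
        offset -= size
    raise AssertionError("unreachable: offset is always below total")
```

Since revoked bytes can never be selected, storing them gives a miner no advantage in answering challenges.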

The non-compliance section is an issue. 1 - miners who hold on to data which has been revoked won't care about the cost as they're holding on to that data for another reason. 2 - They can pretend to not have it, but still store it.

I think there's a major issue with these transparent features on blockchains. It reminds me of Solana's upgradeable smart contract model, which has led to thousands/millions of dollars disappearing due to a developer mistake (or perhaps a more malicious actor). So there's a question of being "rugpull-prone", which this proposal definitely adds as a serious vector. Imagine an NFT which is revokable: it's a disaster waiting to happen.

I guess it's a question of "do you want to add any feature which could lead to any form of censorship?". And the issue is that adding optionality here only leads to an added attack vector there. Users of the system won't know (or shouldn't need to know) about revokable transactions. In order for Arweave to be adopted by the entire world there needs to be trust in the system, and this adds reason to distrust it.

Sidenote: @RdeWilde what you're describing is perpetual temporary storage and lacks the censorship resistance aspect