Joystream / joystream

Joystream Monorepo
http://www.joystream.org
GNU General Public License v3.0
1.42k stars 115 forks source link

Apps as first-class citizens #4307

Closed bedeho closed 7 months ago

bedeho commented 1 year ago

Background

When an application brings a creator to the platform, or a creator publishes a video, often this is positive spillover for the system as a whole. It makes the system more attractive for other applications, consumers and creators, and it also allows the DAO to capture value from downstream activities associated with this publishing, like NFT fees, gateway burns, etc. This means that it is efficient for the DAO to directly subsidies such onboarding in a way which is informed by how much value a creator brings to the platform down stream, for example as measured by various burns or shares of creator payouts. Such subsidisation however requires attribution. It must be known what application actually was the vehicle for capturing and onboarding the creator and content. Such attribution has another strong benefit. When content is published with authentic application attribution, other applications receive a signal about the properties of the content being published, by virtue of things like their creator/content pool post processing steps (screening, transcoding, ...). Lastly, attribution involves creating an application index which will serve as an excellent public index of what exactly is being built on and served on the platform at any given time.

Proposal

Introduce applications at the metaprotocol level, and wrap existing metaprotocol messaging inside authenticated application action messages. The reason we put applications at the metaprotocol is that we don't need automated value transfer, e.g. x% of creator payout going to onboarding application. This is not only overkill right now, but we are unlikely to pick the exact right approach out of the box, so sticking to a more light weight and flexible approach is better.

Member & Content lead remarks

We make applications owned by either members or the DAO itself, as presented by the content lead. We add the following new messages with obvious semantics to be used over content lead or member remark extrinsics:

Content Directory Creation Extrinsics

As preface note that we still allow people to create channels and publish just like before.

We introduce a new message AppAction that represents the use of some content directory functionality through a specific application. with fields

The reason raw_action is not locked to a specific set of messages is that it allows apps to start playing with distinct metadata formats while still being able to signal that actions are through a specific app. The app action commitment refers to all of the information relevant to identifying a distinct action. There are two types of runtime-level actions with which application actions can be combined channel creation and video creation. Arguably editing videos could also be covered, but it's more reasonable to just start with the basics.

Channel creation

Here the following extrinsic is used

pub fn create_channel(
    origin,
    channel_owner: ChannelOwner<T::MemberId, T::CuratorGroupId>,
    params: ChannelCreationParameters<T>,
)

pub struct ChannelCreationParametersRecord
...
    /// Assets referenced by metadata
    pub assets: Option<StorageAssets>,
    /// Metadata about the channel.
    pub meta: Option<Vec<u8>>,
    /// Map from collaborator's MemberId to collaborator's ChannelAgentPermissions
    pub collaborators: BTreeMap<MemberId, ChannelAgentPermissions>,
    /// Storage buckets to assign to a bag.
    pub storage_buckets: BTreeSet<StorageBucketId>,
    /// Distribution buckets to assign to a bag.
    pub distribution_buckets: BTreeSet<DistributionBucketId>,
    /// Commitment for the channel state bloat bond.
    pub expected_channel_state_bloat_bond: Balance,
    /// Commitment for the data object state bloat bond for the storage pallet.
    pub expected_data_object_state_bloat_bond: Balance,
}

In this case the

NB: be aware of replay attack mentioned by @kdembler here https://github.com/Joystream/joystream/issues/4307#issuecomment-1288837094

Video creation

Here the following extrinsic is used

pub fn create_video(
    origin,
    actor: ContentActor<T::CuratorGroupId, T::CuratorId, T::MemberId>,
    channel_id: T::ChannelId,
    params: VideoCreationParameters<T>,
)

and its basically an analogous story, just include channel_id in commitment (no time!).

NB: be aware of replay attack mentioned by @kdembler here https://github.com/Joystream/joystream/issues/4307#issuecomment-1288837094, same issue exists for videos.

mochet commented 1 year ago

Without wanting to overcomplicate this issue, I would propose a slight expansion to this in an ideal world.

My thoughts are that this app database should be expanded to basically include everything of use to the Joystream platform.

If we had this kind of data of tools, tutorials and so on stored and maintained in a good manner, it would be an invaluable resource and save thousands of manhours in the long term of having people update wiki style pages. Things like:

A good example is this page: https://wiki.polkadot.network/docs/build-tools-index (Last updated on Sep 18, 2022) Brief stats to highlight how many manhours have probably been spent maintaining this single page:

During the early testnet phases we had an extensive problem of "off site" documentation copies being made which were never accountable or documented, basically making a complex mess of things.

This may not be the best solution or the best place for it, but documentation and tools are a common Achilles heel of the web3/DAO world and it seems like this idea could perhaps be expanded (if not now then maybe at some stage in the future).

I am curious if the app_id could be used similarly by users to signal the value of documents and if some moderation layer could be added to prevent malicious actions... but this probably ventures into replacing git tooling at some stage.

bedeho commented 1 year ago

This may not be the best solution or the best place for it, but documentation and tools are a common Achilles heel of the web3/DAO

I agree with this, this whole layer has a very specific purpose, and it may even become nativ in the future. Simply documenting resources available should be done in some other way.

kdembler commented 1 year ago
  1. Re app metadata: looks good but not sure why distinction between platform and versions is necessary. Maybe I'm misunderstanding but I would just go with a single platforms entry that accepts an array like ['web', 'native_osx']
  2. Following your Channel creation example, do I understand correctly that this would be roughly the call to create_channel?
    
    const rawAction = objectToProtoBytes({
    createChannel: {
    title: 'Channel title',
    // ...
    },
    })
    const appActionMetadata = 'unused?'

create_channel(chanelOwner, { // ... meta: objectToProtoBytes({ appAction: { app_id: 'app_id', metadata: appActionMetadata, raw_action: rawAction, signature: signMessage([serialize(channelOwner), serialize(assets), rawAction, appActionMetadata]), }, }), })


3. If I'm not mistaken, with the proposed signature scheme, a single signature from the app operator could be replayed any number of times by a single member, as long as they create new channels with the same metadata and assets. I guess `appActionMetadata` could be used to prevent that?
bedeho commented 1 year ago
  1. yeah probably not, cancelled.
  2. this looks right, but obviously signMessage has to be executed by server side which holds app auth_key, but conceptually yes.
  3. yes, this is right! one would need to use the number of channels ever created by this channel_owner as a nonce, this would have to be maintained in QN, as runtime does not need this state for any direct verification. Ill add a remark about this to the initial sketch. Same isssue exists for videos.
kdembler commented 1 year ago

Looks good to me then!

kdembler commented 1 year ago

@bedeho just occurred to me - currently the app commitment doesn't have an easily identifiable structure. That is, given serialized message ready for signature, it's not easy for the backend to verify that the message is actually an app commitment. We wouldn't want the backend to sign any arbitrary data with its key. Maybe we introduce some constant prefix to the message so it can be validated?

traumschule commented 1 year ago

to basically include everything of use to the Joystream platform

@mochet I want to give specific but also general feedback on this point that it will help to first provide a sample of mentioned info to then implement UI that displays it. Can also be mock data.

bedeho commented 1 year ago

@kdembler

just occurred to me - currently the app commitment doesn't have an easily identifiable structure. ... it's not easy for the backend to verify that the message is actually an app commitment

So what we are looking at is serialize(channel_owner)||serialize(assets)||AppAction::raw_action||AppAction::metadata, the commitment. So you may be right, as the docs say that SCALE is not self-descriptive, so if you just feed an instance of that whole schema into a decoder, it probably will not discover the boundaries between channel owner and assets. I think this just means the API between client app and backend requires client to send these fields as distinct inputs that are then used server-side to build the commitment before signing. If client is going to send the full final transaction, as it would if its not a hosted app or a collaborator mode, then the signature would be sent back, and the client could also verify its validity before sending extrinsic.

What do you think?

kdembler commented 1 year ago

@bedeho

I think there may actually be 2 parts to this:

  1. My original point was that the backend doing the signing (Orion) should probably verify in some way that the message it's signing is actually an app commitment and not an arbitrary message (for example I verify that I'm hacked). Current implementation just accepts any bytes message and will sign everything. While it's not an obvious attack vector, that's probably something worth avoiding. I think an easy solution for this is just to add a constant prefix to all the app commitments (for example APP_COMMITMENT), so that Orion will only sign messages starting with that
  2. I think what you mention in the comment above is related but actually I'm not sure if it's an issue. If there's prefix and arbitrary messages are avoided, I don't think Orion needs to decode the commitment, it could just sign it blindly. If that's the case then I imagine only QN does decoding to verify that the signature actually matches the app key. However, QN should have access to all the fields that go into the commitment so it should be able to create the commitment on its own and verify that it matches what was signed.

Does it make sense?

bedeho commented 1 year ago

Ok, then I don't think I really appreciate why this is important to solve on its own?

I see many possible use cases for Orion actually enforcing some rules on the upload flow before allowing someone to publish into its category space. For example, maybe it wants to filter out content with certain very bad key words from titles, or enforce some sort of more complex rule based on AppAction::metadata and who the publisher is. I don't really see what problem the prefix approach solves, but I would venture to say that the deeper validation I am describing would solve this as well.

Sidenote: I think that when transcoding works, then I suspect Orion would be constructing all of this server-side purely anyway, and write to chain as a collaborator, or even as a custodian if we have custodial keys. In which case all of this goes away as an important issue, at least for any app that uses Orion.

kdembler commented 1 year ago

I don't really see what problem the prefix approach solves

It's a mechanism to stop people from abusing the application key and signing any data with it, app commitment or not. With the current WIP Orion implementation, you could even construct a blockchain transaction and request a signature for it.

but I would venture to say that the deeper validation I am describing would solve this as well.

Agreed.

Sidenote: I think that when transcoding works, then I suspect Orion would be constructing all of this server-side purely anyway, and write to chain as a collaborator

In my head, content metadata would still be written on-chain by the client application and Orion would later use its collaborator/custodian access only to update the assets, but that of course requires deeper thought.

bedeho commented 1 year ago

It's a mechanism to stop people from abusing the application key and signing any data with it, app commitment or not. With the current WIP Orion implementation, you could even construct a blockchain transaction and request a signature for it.

I don't think that is a real problem per say, because the key used for app operators should not be the same key you use for an actuall account! I took this as obvious, but probably this is such a room for gigantic error that Orion should almost block you from that. Actually, what signature scheme was chosen for this key, is it the same one as blockchain level accounts? @WRadoslaw

WRadoslaw commented 1 year ago

So, right now we use sr25519 signature, which I believe is the same as the Polkadot accounts one. I can change it to ed25519 (quick change since polkadot/crypto-utils offers it out of the box)

bedeho commented 1 year ago

I can change it to ed25519 (quick change since polkadot/crypto-utils offers it out of the box)

Ok great 👍 , lets do that, just one less possible source of error.

Is there some issue for adding this sort of signing stuff to Orionv2? perhaps changing the endpoints as described can be noted there.

kdembler commented 1 year ago

I don't think that is a real problem per say, because the key used for app operators should not be the same key you use for an actuall account! I took this as obvious, but probably this is such a room for gigantic error that Orion should almost block you from that. Actually, what signature scheme was chosen for this key, is it the same one as blockchain level accounts?

I don't think the problem is actually signing transaction, this was just one example. The issue is much more generic, which is exposing private key for public to use as their wish but it seems you don't think that's an issue so let's ignore then

WRadoslaw commented 1 year ago

@bedeho Got a few questions about implementation:

  1. Which information coming from AppAction should be saved to the final entity (channel, video), is appId enough or we should make a relation to App entity as well?
  2. What if the signature on the AppAction is invalid? Should we allow the user to retry connecting the app on channel update tx?
  3. I guess the most important is the way of serializing. RN I don't really know how to approach this from a technical standpoint. serialize(channel_owner)||serialize(assets)||AppAction::raw_action||AppAction::metadata suggests that we should choose one of those and make a signature on it or should it serialize all of them and then pack it in vector and serialize again (if so couldn't we just serialize final vector)? The second problem is serialize in itself, do you have any library in mind for this job? I have found paritytech lib that can help with it, but I'm not sure whether it's up to me to choose.
bedeho commented 1 year ago

Which information coming from AppAction should be saved to the final entity (channel, video), is appId enough or we should make a relation to App entity as well?

I think a nullable relationship to the app entity would be ideal, but nothing more is required I think. I think that should allow queries like "list all videos from app with id={}", since we now have relationship filtering, but we can probably drop app deletion, as that would break that reference and has no real value imho. Agree?

What if the signature on the AppAction is invalid? Should we allow the user to retry connecting the app on channel update tx?

No, in that case the appid relationship is just null.

I don't really know how to approach this from a technical standpoint. ... suggests that we should choose one of those and make a signature on it or should it serialize all of them and then pack it in vector and serialize again (if so couldn't we just serialize final vector)?

Ah, no so || in applied cryptography means concatenate, its not or. If it had been or, the signature would not commit to all important information, and you would get into serious problems with people making fake actions.

The second problem is serialize in itself

serialize can really be any sort of bijection that outputs something that you ultimately can sign using your signing library, you don't even need to use the same serialized for the channel_owner and assets. I'm sure the first thing you can think of is perfectly plausible, if you are still unsure, I think @zeeshanakram3 or @Lezek123 will def. have great suggestions.

Warning

I am going to post a slight change to this feature in order to allow https://github.com/Joystream/joystream/issues/4158, a long standing request from @mochet and a no-brainer. It basically allows apps to also have native financing, value capture and council payouts, again managable through Atlas, and apps can also be bought and sold wholesale over trust-less OTC, because channels have this. This means that app creation should not happen from memberships, but during channel creation. Will write out full requirements, later, just proceed as if only this issue here is of concern, just giving you a headsup :)

Lezek123 commented 1 year ago

metadata: raw metadata, can be anything.

The reason raw_action is not locked to a specific set of messages is that it allows apps to start playing with distinct metadata formats while still being able to signal that actions are through a specific app

Question from Klaudiusz:

If I'm not mistaken, with the proposed signature scheme, a single signature from the app operator could be replayed any number of times by a single member, as long as they create new channels with the same metadata and assets. I guess appActionMetadata could be used to prevent that?

Current answer:

yes, this is right! (...)


I'm having a hard time trying to understand how this should be implemented, here are some of the things that are not clear to me:

  1. If the metadata (or appActionMetadata) must include the nonce in order to be valid, then as I understand it cannot really be anything. In this case it needs to have a structure like:
    message AppActionMetadata {
    optional uint32 nonce = 1;
    optional string anything = 2;
    }
  2. As I understand this, a given instance of the query node (or any event processor more broadly) may not be able to decode AppActionMetadata.anything or even AppAction.raw_action, as different apps may choose to use some specific metadata format there, but it should always be able to at least verify that the signature and nonce are valid, to connect given video/channel to a given app. If raw_action is not understood, it is just treated as empty channel/video metadata.
bedeho commented 1 year ago

If the metadata (or appActionMetadata) must include the nonce in order to be valid, then as I understand it cannot really be anything. In this case it needs to have a structure like:

If we are going to care about this, then it might as well just be turned into a part of the standard, so just bake nonce into primary AppAction message. My vague memory, without rereading this whole thread, was that it seemed OK without it, but I'm ok with actually adding it if you think it's important.

Be aware that we have already started to use the metadata field in channel creation context https://github.com/Joystream/youtube-synch/issues/139

If you want to make these changes, make sure to coordinate with @WRadoslaw @drillprop quickly, as they have not incorporated this new style.

As I understand this

I did undersand what exactly you were asking here, but I agree with everything.