Joystream / orion

Atlas backend
GNU General Public License v3.0

Orion V2 initial implementation draft #48

Closed Lezek123 closed 1 year ago

Lezek123 commented 1 year ago

Orion V2 initial implementation draft

The draft implementation I'm describing in this document can be found here: https://github.com/Lezek123/subsquid-orion.

This repository is based on the current squid-substrate-template and contains all the input schemas, refactored Atlas queries, drafts of the custom GraphQL resolvers etc., as well as some basic setup and reference code to illustrate how certain issues can be addressed.

The official Subsquid documentation, which is often referenced throughout this document, can be found here: https://docs.subsquid.io/

How to run the local setup

I'll be explaining each part of the setup more deeply later in this document, but here is just a quick-start reference:

  1. Clone the Joystream repository (if not already done)

    git clone https://github.com/Joystream/joystream.git

  2. Run the joystream-node service in the Joystream repository

    # You can also specify the usual environment variables like `RUNTIME_PROFILE` etc.
    export JOYSTREAM_NODE_TAG=$(./scripts/runtime-code-shasum.sh)
    docker-compose up -d joystream-node

  3. Clone the Subsquid-Orion repository

    cd ..
    git clone https://github.com/Lezek123/subsquid-orion.git

  4. Run the archive (indexer)

    cd subsquid-orion/archive
    docker-compose up -d

  5. Build the processor

    cd ..
    npm install
    make codegen
    make build

  6. Run and migrate the processor database

    make up
    make migrate

  7. Run the processor

    make process

  8. Run the GraphQL server

    make serve

After performing those steps you should be able to go to http://localhost:4350/graphql and see something like this (screenshot: orion-graphql).

Currently the processor will produce some mock data on each block, so you can also test some of the existing queries (screenshot: orion-query).

On-chain data indexing and processing

Squid Archive

The Squid Archive is a concept analogous to the Hydra Indexer: it uses the Joystream node WebSocket RPC endpoint to fetch data about on-chain events and extrinsics and stores it in a relational database (PostgreSQL).

We can configure the archive via a docker-compose file located in archive/docker-compose.yml.

The current Squid Archive configuration uses a local Joystream Docker node (ws://joystream-node:9944) running on the joystream_default network as its source.

SubstrateBatchProcessor

SubstrateBatchProcessor is the class we use to instantiate the events processor. As opposed to Hydra, where we would only implement the "mapping" functions (or "mappings"), Subsquid lets us instantiate and programmatically configure the processor ourselves (a manifest.yml file is no longer required), which gives us more control over its behavior.

SubstrateBatchProcessor is just one of the many processor implementations available in Subsquid, but it's the one currently recommended for processing substrate events and extrinsics. This specific processor implementation queries all blocks along with the events of interest from the Squid Archive (using the @subsquid/substrate-gateway service). The maximum number of blocks in a single batch currently depends on the @subsquid/substrate-gateway implementation; it's still a little unclear how this will work in the future, but currently there are two main components that affect the batch size:

Current processor implementation:

In the current draft implementation:

This implementation provides a decent general overview of how the "mappings" are written in Subsquid and how one can extract the events and data of interest from a batch and then perform bulk inserts/updates at the end of processing a batch, which considerably increases performance.
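For orientation, here is a minimal sketch of this batch-processing pattern, assuming the FireSquid-style SubstrateBatchProcessor API; the archive URL, the selected event and the mapping body are illustrative assumptions, not the actual draft code:

```typescript
import { SubstrateBatchProcessor } from '@subsquid/substrate-processor'
import { TypeormDatabase } from '@subsquid/typeorm-store'

// Sketch only: endpoints and the selected event are assumptions for illustration.
const processor = new SubstrateBatchProcessor()
  .setDataSource({
    archive: 'http://localhost:8888/graphql', // local Squid Archive gateway (assumed port)
    chain: 'ws://localhost:9944',             // Joystream node WebSocket RPC
  })
  .addEvent('Content.VideoCreated', { data: { event: { args: true } } })

processor.run(new TypeormDatabase(), async (ctx) => {
  // 1. Collect the data of interest from the whole batch first...
  const videosToSave: unknown[] = []
  for (const block of ctx.blocks) {
    for (const item of block.items) {
      if (item.name === 'Content.VideoCreated') {
        // ...decode the event args and build entity instances here
      }
    }
  }
  // 2. ...then persist everything with bulk inserts/updates at the end of the batch, e.g.:
  // await ctx.store.save(videosToSave)
})
```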

The API

Input schema

The current input schema files can be found here: https://github.com/Lezek123/subsquid-orion/tree/main/schema. I tried to preserve a schema similar to the one we currently use in Hydra; however, there are some notable differences:

Custom models

Subsquid comes with a nice directory structure allowing us to define our own TypeORM models separately from the autogenerated ones; however, they will all become part of the same database.

Use cases:

The primary use case for defining these custom models is when we don't want Subsquid to autogenerate the public API endpoints for querying certain (private) data, but we still want to keep this data as part of the same database to take advantage of the relational model. Take the User entity for example: we want to be able to connect users with channels through the User>-ChannelFollow-<Channel relationship, but we don't necessarily want to expose any User data through the API. That's why we define custom models for User and ChannelFollow, but don't include those entities in the input schema.
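As a rough illustration (the field names here are assumptions for the sketch, not the actual draft definitions), such custom models are plain TypeORM entity classes kept outside the autogenerated model directory:

```typescript
import { Entity, PrimaryColumn, Column, ManyToOne, Index } from 'typeorm'

// Custom (non-autogenerated) models: because they are not part of the input schema,
// no public API endpoints are generated for them, but they live in the same database.
// Field names below are assumptions for illustration only.
@Entity()
export class User {
  @PrimaryColumn()
  id!: string

  // Example of private data we don't want exposed through the public API
  @Column({ nullable: true })
  ip?: string
}

@Entity()
export class ChannelFollow {
  @PrimaryColumn()
  id!: string

  // The User side of the User>-ChannelFollow-<Channel relationship
  @Index()
  @ManyToOne(() => User)
  user!: User

  // The Channel side could point at the autogenerated Channel entity;
  // kept as a plain id column here to keep the sketch self-contained.
  @Column()
  channelId!: string
}
```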

Custom GraphQL api extensions

Subsquid allows us to add some custom extensions to the autogenerated GraphQL API. These are stored in the src/server-extension/ directory and constitute a significant part of the project.

Custom type-graphql Resolvers

Custom type-graphql resolvers are classes where we can define our custom GraphQL queries, mutations and subscriptions that will then be included in the final API.

Normally we run a Subsquid GraphQL server using the @subsquid/graphql-server library/service, which generates and runs a GraphQL server based on the input schema. For the purpose of generating the final ("output") schema and resolvers, it uses another library called @subsquid/openreader. The schema generated by @subsquid/openreader is then merged with the schema generated from the custom resolvers we provide in src/server-extension/resolvers. For this merge, the mergeSchemas method from the graphql-tools library is used.

The interesting property of mergeSchemas is that this method also merges all individual GraphQL types defined in both schemas, which allows us to reuse the autogenerated types like Video, VideoWhereInput, VideoOrderByInput etc. All we have to do is define a GraphQL object with the same name in our resolvers space, with at least one property that matches the autogenerated object (for entities it can be, for example, id: string). Then, when the types are merged, we get a consistent Video object with all the expected properties in the final schema.

This can probably be better understood by looking at the implementation inside https://github.com/Lezek123/subsquid-orion/tree/main/src/server-extension/resolvers, especially https://github.com/Lezek123/subsquid-orion/blob/main/src/server-extension/resolvers/baseTypes.ts, where the "placeholders" for the to-be-autogenerated types are defined.
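To give a feel for the shape of such a resolver and type "placeholder", here is a simplified sketch; the query, the table name and the exact fields are illustrative assumptions rather than code copied from the draft:

```typescript
import 'reflect-metadata'
import { Arg, Field, Int, ObjectType, Query, Resolver } from 'type-graphql'
import { EntityManager } from 'typeorm'

// "Placeholder" for the autogenerated Video type: only one matching field (id) is
// needed here; the remaining fields are merged in from the openreader schema.
@ObjectType()
export class Video {
  @Field()
  id!: string
}

@Resolver()
export class VideosResolver {
  // Subsquid's graphql-server injects a function returning a TypeORM EntityManager
  // into custom resolver constructors.
  constructor(private tx: () => Promise<EntityManager>) {}

  // Illustrative custom query, not part of the actual draft API.
  @Query(() => [Video])
  async mostRecentVideos(@Arg('limit', () => Int) limit: number): Promise<Video[]> {
    const em = await this.tx()
    // Assumed table/column names ("video", "created_at") for illustration only.
    return em.query('SELECT id FROM video ORDER BY created_at DESC LIMIT $1', [limit])
  }
}
```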

There are also many other useful references in this directory:

Use-cases summary for custom resolvers:

checkRequest plugin

The checkRequest plugin is a Subsquid feature that allows us to act on the Apollo server's requestDidStart event. The handler function can be implemented inside src/server-extension/checkRequest.ts and receives information like request headers, the IP of the origin, all the data specific to the GraphQL request etc.

The current example implementation shows how this plugin can be used to introduce some authentication for all mutation requests.
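The sketch below only illustrates the general idea; the exact handler signature and return convention are assumptions and should be checked against the @subsquid/graphql-server implementation:

```typescript
// Hypothetical sketch: the shape of the request context and the meaning of the
// return value are assumptions, not the documented @subsquid/graphql-server API.
type RequestCheckContext = {
  http: { headers: Record<string, string | undefined>; ip?: string }
  operation: { operation: 'query' | 'mutation' | 'subscription' }
}

export async function checkRequest(req: RequestCheckContext): Promise<boolean | string> {
  // Require a shared secret for all mutation requests (assumed header name).
  if (req.operation.operation === 'mutation') {
    const secret = req.http.headers['x-operator-secret']
    if (secret !== process.env.OPERATOR_SECRET) {
      return 'Invalid or missing operator secret' // treated as an error message in this sketch
    }
  }
  return true // let the request through
}
```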

Use cases:

Atlas queries: Refactored!

I refactored all existing Atlas queries (https://github.com/Joystream/atlas/tree/master/packages/atlas/src/api/queries) to match the new schema.

The results can be seen here. The directory structure matches the one in the Atlas repository, which makes it easy to do a side-by-side comparison. I also added a CHANGE: comment in all places where changes were introduced.

The most notable changes can be observed in the notifications / events queries, due to the refactoring of the Event entities. It is now easier to query all the events of interest together and apply filtering, sorting and limits on the results of one query, instead of making separate queries for each event type and then post-processing the results client-side.

Some other notable changes include:

Custom migrations: Setting up the database

Subsquid allows us to generate database migration files that we can then use to set up the processor database. Besides that, we can also specify some custom migrations that will be run before or after the generated ones. In the draft implementation I introduced 2 custom migrations: Views and Indexes (since the filenames and class names need to include a timestamp, I've just chosen some arbitrarily high values to make sure those migrations always run after the autogenerated ones).
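Such a custom migration is just a plain TypeORM migration class; a minimal sketch with an artificially high timestamp could look like the following (the view and table names are illustrative assumptions):

```typescript
import { MigrationInterface, QueryRunner } from 'typeorm'

// The artificially high timestamp in the filename/class name ensures this migration
// always runs after the autogenerated one. The SQL below is illustrative only.
export class Views9000000000001 implements MigrationInterface {
  name = 'Views9000000000001'

  public async up(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`
      CREATE OR REPLACE VIEW videos_view AS
        SELECT * FROM video WHERE is_censored = false
    `)
  }

  public async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`DROP VIEW IF EXISTS videos_view`)
  }
}
```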

Use cases for custom migrations:

Performance

Using the mocked data I did some performance tests against the current implementation, here are some results:

  1. GetExtendedBasicChannels query

Arguments:

  where: { activeVideosCount_gt: 2 },
  orderBy: createdAt_DESC,
  limit: 50

Number of channel entries: 12,921
Number of video entries: 257,400
Time to execute the query: 86ms

  2. GetNotifications query

Arguments:

  channelId: "1",
  memberId: "1",
  limit: 50

Number of event entries: 2,574,000
Time to execute the query: 880ms

  3. GetNftHistory query

Arguments:

  nftId: "1"

Number of event entries: 2,974,000
Time to execute the query: ~9 seconds (!)

A potential candidate for optimization.

Benchmarks to be continued...

Known issues and unresolved questions

Alternatives to consider

"Manually" setting up graphql server instead of using @subsquid/graphql-server

To have more control over the setup, we can run the GraphQL server from within our own codebase instead of using @subsquid/graphql-server; we can still take advantage of @subsquid/openreader to generate the initial schema and resolvers.
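A very rough sketch of this alternative is shown below; the openreader schema-generation step is stubbed out because its exact API isn't reproduced here, so treat this strictly as the shape of the idea rather than a working setup:

```typescript
import 'reflect-metadata'
import { GraphQLSchema } from 'graphql'
import { ApolloServer } from 'apollo-server'
import { mergeSchemas } from '@graphql-tools/schema'
import { buildSchema } from 'type-graphql'
import { VideosResolver } from './server-extension/resolvers'

// Placeholder: in a real setup this would call into @subsquid/openreader
// (the same way @subsquid/graphql-server does internally) to build the schema
// from the input schema files. The exact openreader API is not shown here.
async function buildOpenreaderSchema(): Promise<GraphQLSchema> {
  throw new Error('plug in @subsquid/openreader schema generation here')
}

async function main(): Promise<void> {
  const openreaderSchema = await buildOpenreaderSchema()

  // Schema built from our custom type-graphql resolvers (src/server-extension/resolvers).
  const customSchema = await buildSchema({ resolvers: [VideosResolver] })

  // Merge both schemas, mirroring what @subsquid/graphql-server does with graphql-tools.
  const schema = mergeSchemas({ schemas: [openreaderSchema, customSchema] })

  const server = new ApolloServer({ schema })
  const { url } = await server.listen({ port: 4350 })
  console.log(`GraphQL server running at ${url}`)
}

main().catch(console.error)
```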

Pros of this approach:

Cons:

dmtrjsg commented 1 year ago

I removed collaborators field from the Channel entity. This is mainly for simplification purposes and to reduce the initial scope of work, they can of course be added later if needed;

This is used for the YPP project: collaborators are added to channels with a single permission (to add videos), for the purpose of the YPP auto-sync feature.

searchChannels query (allows implementing custom channel search logic), searchVideos query (allows implementing custom video search logic)

What are the realms of possibility for how search relevancy can be defined? Would it be possible to define it as a mix of title/description text match (fuzzy), date uploaded, NFT minted, video views, and channel subscribers, defined as some sort of polynomial?

kdembler commented 1 year ago

I'm amazed we managed to get a working draft that fast. All the improvements like deeper filtering and especially nested field queries will make our lives much easier. Looks very promising

@dmtrjsg

This is used for the YPP project: collaborators are added to channels with a single permission (to add videos), for the purpose of the YPP auto-sync feature.

We only add collaborators, we don't actually read them from QN, so not a problem

dmtrjsg commented 1 year ago

MM from 13 Dec:

Status

Most mappings are done, some mappings are still WIP, NFT mappings are to be done

Features Round Up

The overall v1 objective is to relaunch Orion using the immediate benefits and to build a solid foundation for future expansion in the areas of Orion2 accounts, notifications, advanced featuring, advanced search relevancy, and scalability and performance improvements.

  1. Filtering by operator must be supported - must be in the v1 release
  2. Search is just a string match for now - (enhanced search can be left out of the v1 release)
  3. Categories can be processed, even the ones that are not of interest - we do not need pre-processing for speed in v1
  4. Featuring will need to be replicated from Orionv1
  5. Some deep technical issues on speeding things up - can be left out of v1

Next Steps

  1. Atlas team efforts, led by @drillprop, will focus on trying to integrate Atlas with Orion2, listing all places that need to be addressed in a separate dedicated issue to be created
  2. We will meet once a week on Tuesdays; @dmtrjsg to set up the recurring meeting at the same time as today to discuss the progress and sync up on next steps. Next meeting is Tuesday the 20th.
  3. Atlas team has Orion2 as the new top priority. (Payouts/YPP bugfixing and release assistance is priority 0, and Playlists is prio 2)
dmtrjsg commented 1 year ago

MM Dec 20th:

Orion2:
- All mappings are now completed, incl. NFT
- Mutations that were supported by Orion are added to GraphQL
- Adding support for custom categories
- No custom queries just yet

Atlas: PR opened with the new queries re-written; some things that are not working are mentioned in the PR.

Next milestone: prepare a testable environment to test the functionality after all of the queries are re-written. The test environment would entail a deployment of Orion2 pointing to Atlas dev.

Remaining effort on Orion2 side

50% spanning custom queries, testing and benchmarking.

dmtrjsg commented 1 year ago

MM: 10th January 2023

Orion side

- Finished the views: all videos and channels that should not appear (censored, filtered out) will not be displayed by default.
- Added a mutation for the operator to set the supported categories; it also supports setting whether videos with no category are supported, as well as recently added categories that did not exist before.
- Most viewed videos connection query added.
- Extended channels query added (ones with most views).
- Resolved custom parser issues.

The full list of queries to implement is tracked in this issue:

🟢 Action: @Lezek123 to add some indicative timelines for the groups of tasks indicated in the tracker issue, for better coordination with the Atlas team's pace.

Atlas side

Small changes added after follows and views were added on the Orion2 side. Updated the PR (as previously mentioned) with a playground for Orion, but haven't started testing the latest changes. Membership creation, uploading videos etc. in the new integrated environment will be attempted today, if time allows, after Gleev launches.

Follow ups will be shared after testing.

🟢 Action: @drillprop will create the tracker issue similar to #52 to track the progress of testing and remaining effort.

dmtrjsg commented 1 year ago

Comments on scope:

dmtrjsg commented 1 year ago

Release Update

⚠️ All queries needed for Atlas to function are now complete 🥳 The ETA on the first release now strictly depends on the outcome of manual testing. Positive-scenario estimate for the first release: in 2 weeks' time, subject to the outcomes of manual testing 🔜

Orion

First release scope will not include

Currently Lezek is working on the

Atlas

On the Atlas side, no tests have been done yet in light of other priorities last week, but this will be the second priority after the "Apps as Channels" work is done, unblocking Radek to progress with "Apps as first class citizens".

Manual testing tracker issue:

dmtrjsg commented 1 year ago

@Lezek123 we've concluded that the Orionv2 scope needs to be increased to accommodate the Apps feature:

There's an open question whether a CLI interface needs to be built for Orion now to create channels, so that we don't duplicate this functionality in QN and Orion. Something for you to advise us on. I'll share this in Discord and we can all discuss there.

dmtrjsg commented 1 year ago

Orion

New issues added:

Details in the tracker issue:

Atlas

ETA on resolving issues: a few days, then about 1 week of testing.

Details in the tracker issue:

Lezek123 commented 1 year ago

Documentation links:

dmtrjsg commented 1 year ago

Release timeline

Release next week does not seem unrealistic.

Atlas

Everything is fixed now 👍 Testing is in an active stage 👍 The notifications issue discovered before is now fixed (notifications from all users coming to the inbox) ✅ A new issue with the NFT widget was found and fixed ✅

❓ What to do with the sections? We have strange algorithms in Orionv1, like shuffling the order of the videos in the featured sections. Do we need to include this in v2? Yes, but it is unclear whether it can be replicated in v2. @drillprop, @Lezek123 can you please add the details here as comments? It may be possible to take the results and then shuffle; it's also unclear how to determine which entities should be shown on pages 2 and 3. Proposed solution: shuffle them on the client side.

Applies to

Homepage

⚠️ The remaining tests are due to be done in the next few days, possibly tomorrow by EOD (subject to the Apps workstream).

Orion2

Pending: Apps as first class citizens mappings + one mutation ⚠️ <- top Prio. 1-2 days. Finish testing on

ℹ️ No operator queries yet; these would allow exploring private entities and making channel and video reports. Not critical.

dmtrjsg commented 1 year ago

Release Approach Update

If we release Orionv2 after Apps and YPP, the mappings have to be maintained on both QN and Orionv2, so there's an argument to release it together with YPP and Apps. Equally, there is a risk of releasing without Orion2 (like channel deletion etc.).

@attemka to partake in the review of Bartosz's PRs; a second pair of eyes would be beneficial.

After the data comparison the confidence in Orion2 is pretty high, so the consensus is leaning towards a joint release: YPP + Apps + Orion2.

Progress update

Atlas

Groundwork is done. Testing of every query was performed. Some small issues were detected; they can be dealt with separately on the Atlas side.

Specifically:

Other than the above, we are good to use Orion2 on the Ephesus env for subsequent further testing.

Orion

Other priorities took the attention. The comparison script to compare performance is almost finished. Deleting channels that have subscribers caused some problems. One issue with the previous bid: for open auctions it wasn't set on Orion1 -> decided to preserve the current behaviour in Orion2.

Tasks left before release:
- Orionv2 needs to include apps as members, due
- Operator queries are out of scope for the first release.
- Update the changelog based on the findings of the benchmarking script.
- It would also be nice to compare the data from the mainnet (run Orionv1 and Orionv2 and compare the data).

All of the above should be possible this week, but taking time till next Tuesday in the light of Ephesus work/ reviews.

⚠️ It will also be possible to exclude content from the platform (videos, NFTs, channels). This is currently done on the Atlas side; with Orion2 this feature can be removed from Atlas completely. is_public_tracking should not be removed. is_censored content will be automatically excluded.

⚠️ Excluded videos are managed via operator mutations/queries (screenshot from the changelog, 2023-02-07).


⚠️ ⚠️ The operator will not see excluded content as a result of this being managed on the Orion2 side. ☝️ To allow for the pessimistic approach, where only approved videos are displayed, an additional change on the Orion and Atlas sides would be required.

dmtrjsg commented 1 year ago

MM 14.02.23

 Atlas

Everything is done; awaiting a Playground deployment for extensive testing. Test scripts/scenarios to be prepared; @drillprop will take care of this ⭐

NB: Batching is not supported on Orion2; no-cache is not implemented, but it's a low-priority enhancement:

 Orion

Release

We need a playground which has: Apps-branch Atlas + QN + master runtime + Orionv2 + Colossus/Argus from the master branch ⏭️ Next week, as we want App actions to be included in Orionv2; @attemka will try to do it next week with support from @Lezek123 and @mnaamani.

Other notes

⚠️ Orionv2 improvements not added yet: @drillprop will raise the issues on the Atlas side.

dmtrjsg commented 1 year ago

MM 28.02

Release

YPP releases first, with a QN dependency. Orionv2's last tasks are due this week, to be tested on the current PG with Orionv2 enabled. Releasable date circa next week, after YPP is released to prod.

Envs

@Lezek123 will deploy Orionv2 on the same instance where the current PG is, so we can test before release.

Atlas

Testing is almost complete; two outstanding issues remain (they work the same as in Orionv1).

We are currently using the events queries, and recently these were updated to more performant ones. These need to be introduced to Atlas before release. @drillprop will raise an issue and work on this in this sprint; the work is small, and notifications need to be retested (half a day).

Orion

Fix to be applied for outstanding issues in

YPP

YPP does not depend on Orion. It depends on QN only.

❓ Do we need to migrate queries from QN to Orionv2 for YPP?

dmtrjsg commented 1 year ago

1️⃣ Objective: make Atlas concurrently testable for YPP+Apps; Orionv2.1; Ephesus branches

Tasks:

__

2️⃣ Objective: Orionv2 release. Resolved URLs are now implemented on both Orionv2 and Atlas.

dmtrjsg commented 1 year ago

Release

Aiming for an end-of-next-week release of Orionv2, with YPP+Apps released early next week.

Atlas

Release branch is ready: https://atlas-git-orion-v2-joystream.vercel.app/ To test:

Orion

Everything is ready for release.

dmtrjsg commented 1 year ago

Release date: 23rd in the morning ⭐

Q: Are we going to use the same node that we use for Orionv1? Or do we want to add more firepower for a further boost in performance? Optimise for geo-location?

A: Decided to launch on the same machine, and boost the specs or migrate the instance to a different node if required.

⚠️ @zeeshanakram3 to review this pr before the release:

dmtrjsg commented 1 year ago

Release date: reviewed in light of the Apps + YPP release; decided to move it to this week.

Atlas: Bartosz created a PR on assets mapping.

We need to merge the changes for Orionv2 on the Atlas side to the dev branch.

Orion:

⚠️ @zeeshanakram3 to review this pr before the release:

to be done by end of today or tomorrow morning.

@Lezek123 to run the data comparison script between Orionv2 mainnet and QN.

Data migration: there's an instance where the data was migrated and that was tested.

Release execution plan (steps):

On Thursday 11am CET we are starting to execute the release.

  1. Atlas is set to maintenance mode @attemka
  2. Create a backup of the database and migration files @Lezek123
  3. Deploy Orionv2 (merge Orionv2 to master) @Lezek123
  4. Atlas dev to master merge @attemka
  5. Gleev update (code and env variables) @attemka
  6. Exclude content (videos and channels) and add featured content @drillprop
  7. Atlas is out of maintenance mode @attemka
  8. Smoke test by @attemka @drillprop @dmtrjsg

Release comms will happen in Orion Channel.