Define CI tests for validation of testnet deployment

juanmanso commented 3 months ago

Goal

Have a CI pipeline that runs integration tests for deploying the environment to testnet (and, in the future, production).

What needs to be defined

How would containers be built?
How would the environment be spun up?
Which environment variables should be used?
- Discord Bot Token
- Twitter Credentials
- For the future -> AI Keys
Which suite of tests should take place?
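
Whatever variables end up being chosen, the CI job could fail fast when one of them is missing. A minimal sketch, assuming the secrets are exposed to the job as environment variables; the variable names below are hypothetical, since the issue only lists the kinds of credentials needed:

```python
import os

# Hypothetical variable names; the issue only names the kinds of secrets required.
REQUIRED_VARS = ("DISCORD_BOT_TOKEN", "TWITTER_USERNAME", "TWITTER_PASSWORD")


def missing_variables(env, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


def assert_environment_ready(env=os.environ):
    """Raise a readable error listing every missing variable at once."""
    missing = missing_variables(env)
    if missing:
        raise RuntimeError("Missing CI environment variables: " + ", ".join(missing))
```

Calling `assert_environment_ready()` as the first step of the test run would surface a misconfigured secret before any container is built.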

References

Mention of task here

Related issues

Follows-up roadmap#19
Blocked by #85

Acceptance criteria

[ ] Explore what tests might make sense to have for validating testnet deployment
[ ] We know the areas of the code that are going to be tested
[ ] We know which test cases we need and how we are going to set up the tests (e.g. #108)

hide-on-bush-x commented 3 months ago

I think it will be hard to achieve this on testnet (because of the tTAO required to test). What if we change this to devnet?

@juanmanso @5u6r054

juanmanso commented 3 months ago

Just followed your lead on this comment of yours @hide-on-bush-x

I would love to be able to test this with multiple machines but I was the only one that was able to open the ports for the miners, sooo... I think this is done. Unless we want to add this to the docs which I guess would be a good idea

I see two possible follow ups:

[ ] Task for adding this checklist to the docs

[ ] Task for completing a full E2E test with distributed servers ( Could we use EC2 instances for this @5u6r054? any news on the docker for the miner/validator? )

Maybe we could have our own set of validators and miners deployed on testnet and ping them to have an E2E test there.

Feel free to adjust this task as you see fit 💪

hide-on-bush-x commented 3 months ago

That would be great too, but I was thinking about the CI and Docker: we need some way to get fresh wallets, and on testnet that won't be possible

juanmanso commented 3 months ago

We could regenerate wallets based on their private keys hosted on GitHub Secrets and run those wallets on CI as well
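
A rough sketch of that idea, with a few assumptions stated up front: it uses mnemonics rather than raw private keys, the secret names `COLDKEY_MNEMONIC` and `HOTKEY_MNEMONIC` and the wallet names are hypothetical, and the `regenerate_coldkey` / `regenerate_hotkey` calls should be checked against the bittensor version installed on the runner.

```python
import os

VALID_MNEMONIC_LENGTHS = (12, 15, 18, 21, 24)


def load_wallet_mnemonics(env=os.environ):
    """Read coldkey/hotkey mnemonics from the environment (e.g. GitHub Secrets)."""
    mnemonics = {}
    for key in ("COLDKEY_MNEMONIC", "HOTKEY_MNEMONIC"):  # hypothetical secret names
        words = env.get(key, "").split()
        if len(words) not in VALID_MNEMONIC_LENGTHS:
            raise ValueError(f"{key} is missing or does not have a valid mnemonic length")
        mnemonics[key] = " ".join(words)
    return mnemonics


def regenerate_ci_wallet(name="ci_wallet", hotkey="ci_hotkey", env=os.environ):
    """Recreate the wallet files on the CI runner from the stored mnemonics."""
    import bittensor  # imported lazily so the helper above works without bittensor

    secrets = load_wallet_mnemonics(env)
    wallet = bittensor.wallet(name=name, hotkey=hotkey)
    # Assumption: these methods and keyword arguments match the installed bittensor release.
    wallet.regenerate_coldkey(mnemonic=secrets["COLDKEY_MNEMONIC"], use_password=False, overwrite=True)
    wallet.regenerate_hotkey(mnemonic=secrets["HOTKEY_MNEMONIC"], use_password=False, overwrite=True)
    return wallet
```

Reusing the same wallets on every run avoids needing fresh test TAO, at the cost of tests sharing on-chain state between runs.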

mudler commented 3 months ago

@hide-on-bush-x this card is to be triaged and wasn't planned as part of the sprint. It's fine for now as you already have work in progress, but this shouldn't have been picked up. Also, this was blocked by https://github.com/masa-finance/masa-bittensor/issues/85

hide-on-bush-x commented 3 months ago

Gotcha, thx @mudler

mudler commented 3 months ago

Marking as blocked by https://github.com/masa-finance/masa-bittensor/issues/85

Luka-Loncar commented 3 months ago

@5u6r054 @juanmanso @hide-on-bush-x can you please update the ticket description with acceptance criteria and add follow-up tasks after the spike is done?

hide-on-bush-x commented 3 months ago

Testing plan

How would containers be built?
- Docker
- Only Validator and/or miner
- To reduce waiting times, I suggest we commit to writing slightly more detailed commit messages that say whether a change impacts the miner, the validator, or both. If we change only the miner, we test it against an already deployed validator instance. If we change only the validator, we test it against an already deployed miner. If we change both, we deploy both. We never deploy the masa oracle for these tests; that must be a remote instance.
- The GH action should be able to read the commits for this. For example, feat(validator): changed reward system would only trigger a new validator node for testing
How would the environment be spun up?
- If this is expected to run on testnet, there is no environment to spin up
- (If this was about the node environment): Docker
Environment variables
- There are a few things we need to define for the tests to run
  - Ports for miner and validator
  - Masa protocol URL (we may need to define a testnet oracle with the oracle team and @5u6r054, maybe spin up a domain for that)
  - Wallets: we must define the wallets that will be used for this (given that test TAO is hard to find, we should be smart about this)
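
The commit-scope idea from the container section could be sketched roughly as follows. This is only an illustration: the regular expression assumes Conventional-Commit-style subjects like the feat(validator): ... example, and falling back to deploying both nodes when no known scope is found is an assumed policy, not something decided in this thread.

```python
import re

# Matches subjects such as "feat(validator): changed reward system".
SUBJECT_RE = re.compile(r"^\w+\(([^)]+)\)!?:")
KNOWN_TARGETS = {"miner", "validator"}


def targets_from_commits(subjects):
    """Return the sorted list of nodes to deploy, based on commit scopes."""
    targets = set()
    for subject in subjects:
        match = SUBJECT_RE.match(subject.strip())
        if not match:
            continue
        # Allow several scopes in one commit, e.g. "fix(miner,validator): ..."
        scopes = {scope.strip().lower() for scope in match.group(1).split(",")}
        targets |= scopes & KNOWN_TARGETS
    # Assumed policy: with no recognised scope, deploy both nodes to be safe.
    return sorted(targets or KNOWN_TARGETS)
```

The GH action would pass in the subjects of the commits in the push or pull request and start only the nodes returned.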

Integration tests

There are a few things we may want to test here, such as:

Response accuracy (given an input to an endpoint, it answers correctly): miner/validator
Given a response, the validator is able to set scores and weights on the testnet (we can check all this)

Requirements: input datasets for each endpoint that's going to be tested, e.g. for the Twitter profile endpoint we would query the getmasafi profile

Static tests ( potentially the tests that are going to run on CI )

We must test forward methods from miner and validator
Utils
Rewards calculations
Miner uid selection?
Blacklists

Requirements:

Nox setup
Test data
Mock masa protocol?
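
One way the "mock masa protocol" requirement might look in a static test, using only the standard library's `unittest.mock`. The `fetch_twitter_profile` helper, its signature, and the base URL are hypothetical stand-ins for whatever the miner actually calls; the endpoint path mirrors the one used elsewhere in this thread.

```python
from unittest.mock import Mock


def fetch_twitter_profile(http_get, base_url, username):
    """Hypothetical miner helper: ask the masa protocol for a Twitter profile."""
    response = http_get(f"{base_url}/data/twitter/profile/{username}", timeout=10)
    if response.status_code != 200:
        return None
    return response.json()


def make_fake_protocol(payload, status_code=200):
    """Build a stand-in for requests.get that returns a canned response."""
    response = Mock(status_code=status_code)
    response.json.return_value = payload
    return Mock(return_value=response)


def test_profile_is_returned_on_success():
    fake_get = make_fake_protocol({"username": "getmasafi"})
    profile = fetch_twitter_profile(fake_get, "http://localhost:8080", "getmasafi")
    assert profile == {"username": "getmasafi"}
    fake_get.assert_called_once_with(
        "http://localhost:8080/data/twitter/profile/getmasafi", timeout=10
    )


def test_none_is_returned_on_error():
    fake_get = make_fake_protocol({}, status_code=500)
    assert fetch_twitter_profile(fake_get, "http://localhost:8080", "getmasafi") is None
```

Injecting the HTTP getter keeps the test off the network, so it can run in CI without a protocol node or credentials.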

juanmanso commented 3 months ago

Testing plan

How would containers be built?

Docker

Only Validator and/or miner

To reduce waiting times, I suggest we commit to writing slightly more detailed commit messages that say whether a change impacts the miner, the validator, or both. If we change only the miner, we test it against an already deployed validator instance. If we change only the validator, we test it against an already deployed miner. If we change both, we deploy both. We never deploy the masa oracle for these tests; that must be a remote instance.

The GH action should be able to read the commits for this. For example, feat(validator): changed reward system would only trigger a new validator node for testing

I'd say that if we structure our Dockerfiles properly, so that the very last step is the COPY . . statement or something like that, we could speed up image builds considerably.

Regarding your last point, I'd say we stick to some bash/git snippet that checks the diff for changed files and, if any of the target files were changed, the GH action runs. It's not that convoluted and lets devs use commits as they see fit (but I agree commits should be rather small and concise)
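
A minimal sketch of the diff-based approach, written in Python rather than bash so it can be unit tested. The path patterns and the `origin/main` base are hypothetical placeholders and would need to match the repository's real layout and default branch.

```python
import subprocess
from fnmatch import fnmatch

# Hypothetical path patterns; adjust to the repository's actual structure.
TARGET_PATTERNS = {
    "miner": ["neurons/miner.py", "masa/miner/*"],
    "validator": ["neurons/validator.py", "masa/validator/*"],
}


def changed_files(base="origin/main"):
    """List files that differ between the merge base with `base` and HEAD."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def affected_targets(paths, patterns=TARGET_PATTERNS):
    """Return the sorted targets whose patterns match any changed path."""
    return sorted(
        target
        for target, globs in patterns.items()
        if any(fnmatch(path, glob) for path in paths for glob in globs)
    )
```

In the workflow, `affected_targets(changed_files())` would decide which jobs to run; an empty list means neither the miner nor the validator was touched.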

juanmanso commented 3 months ago

How would the environment be spun up?

If this is expected to run on testnet, there is no environment to spin up

(If this was about the node environment): Docker

If we want to validate after the deployment, then I see your point. However, when I wrote the question I thought we should test before the deployment. If we follow that assumption, we could have a Docker Compose file that spins up a miner and a validator that connect to the testnet, etc.

juanmanso commented 3 months ago

Integration tests

There are a few things we may want to test here, such as:

Response accuracy (given an input to an endpoint, it answers correctly): miner/validator

Given a response, the validator is able to set scores and weights on the testnet (we can check all this)

Requirements: input datasets for each endpoint that's going to be tested, e.g. for the Twitter profile endpoint we would query the getmasafi profile

I'd ask to go deeper here. How would we do that?

IMO we could test response accuracy by making the same request to the masa protocol ourselves and comparing the answers.
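
A sketch of what that comparison could look like once both JSON payloads have been fetched. It assumes some fields may legitimately differ between the two calls; the ignored key names used here are made up for illustration.

```python
def strip_keys(value, ignore):
    """Recursively drop keys listed in `ignore` from nested dicts and lists."""
    if isinstance(value, dict):
        return {k: strip_keys(v, ignore) for k, v in value.items() if k not in ignore}
    if isinstance(value, list):
        return [strip_keys(item, ignore) for item in value]
    return value


def responses_match(subnet_payload, protocol_payload, ignore=("timestamp",)):
    """True when both payloads agree after removing volatile fields."""
    ignore = set(ignore)
    return strip_keys(subnet_payload, ignore) == strip_keys(protocol_payload, ignore)
```

For example, the test would call the subnet endpoint and the masa protocol with the same getmasafi query and assert `responses_match(a, b)`. Fast-changing fields such as follower counts may also need to be added to `ignore`.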

For the metagraph info, which commands should we run for that? @hide-on-bush-x

juanmanso commented 3 months ago

Everything else on your comment LGTM

hide-on-bush-x commented 3 months ago

Will expand on that, probably will write some code too, thx!

hide-on-bush-x commented 3 months ago

Here is an example of how we could check that the setting weights method is working:

import requests
import bittensor
import time

# Connect to a subnet on the testnet
subnet = bittensor.metagraph(165, "test", lite=False)
initial_weights = subnet.W
print(initial_weights)

# Make a REST request to the local API
response = requests.get('http://localhost:8000/data/twitter/profile/getmasafi', timeout=30)
if response.status_code == 200:
    print('Successfully retrieved data from localhost')
    print(f'Response: {response.json()}')
else:
    print('Failed to retrieve data from localhost')

# Wait before re-syncing the metagraph so new weights have a chance to land
print("Waiting a few seconds...")
time.sleep(10)
subnet.sync()

after_request_weights = subnet.W
print(after_request_weights)

# Here, check that the new weights table makes sense

And for checking just the scores we can read the scores file stored locally ( thats easier )
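
The final "makes sense" check could be sketched as below. What counts as sane is an assumption here: every weight is finite and within [0, 1], and each row sums to roughly 0 (no weights set) or 1 (normalized weights). The function takes plain nested lists so it does not depend on the array type behind the weight matrix.

```python
import math


def check_weights(before, after, tolerance=1e-3):
    """Compare two weight matrices given as nested lists of floats."""
    values_ok = all(
        math.isfinite(w) and 0.0 <= w <= 1.0 for row in after for w in row
    )
    # Assumption: a row is either empty (all zeros) or normalized to sum to 1.
    rows_ok = all(
        math.isclose(sum(row), 1.0, abs_tol=tolerance)
        or math.isclose(sum(row), 0.0, abs_tol=tolerance)
        for row in after
    )
    return {"valid": values_ok and rows_ok, "changed": before != after}
```

Taking snapshots as lists before and after the sync (for instance with `.tolist()`, if the matrix type provides it) avoids comparing an object with itself. Since weight setting may be rate limited, `changed` may stay False after a single request, so it is reported separately from `valid`.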

hide-on-bush-x commented 3 months ago

Will create a few tasks for this

hide-on-bush-x commented 3 months ago

Follow up tasks

https://github.com/masa-finance/masa-bittensor/issues/117
https://github.com/masa-finance/masa-bittensor/issues/118
https://github.com/masa-finance/masa-bittensor/issues/119
https://github.com/masa-finance/masa-bittensor/issues/120
https://github.com/masa-finance/masa-bittensor/issues/121
