masa-finance / masa-bittensor

Bittensor Subnet Config
https://masa.ai
MIT License

Set up subtensor, miner (with masa protocol worker) and validator using docker compose #85

Closed Luka-Loncar closed 2 days ago

Luka-Loncar commented 3 weeks ago

Goals:

  1. Have github actions triggered by merges to the 3 envs (dev, test, main) to build a complete masa subtensor network for testing. It will have a few different containers running under docker compose:

These items are mostly done individually. The problem we now face is the timing of setting the weights: when the validator tries to run the "boost", it gets an error saying it hasn't waited tempo/2 blocks. That is true, but it is a lot of blocks (360/2 = 180, according to the hyperparameters).

Currently the validator setup waits for 2250 seconds before trying to do the "boost" / set the weights. This works but is quite a long time to wait.
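
As a rough check on that wait, the minimum delay implied by the tempo/2 rule can be worked out from the tempo and the block time. The sketch below assumes a 12-second block time (an assumption, read off the spacing of the "Imported #34", "#35" and "#36" timestamps in the subtensor log later in this thread) and that the subnet tempo of 360 is the one that applies. The function name is illustrative only.

```python
def min_wait_seconds(tempo: int, block_time_s: float) -> float:
    """Seconds needed for tempo/2 blocks to elapse."""
    blocks_to_wait = tempo / 2  # the rate limit named in the error message
    return blocks_to_wait * block_time_s


# Assumed values: tempo 360 (subnet hyperparameter), 12 s per block (from the log).
print(min_wait_seconds(360, 12.0))  # 2160.0
```

Under those assumptions the floor is 2160 seconds, so the 2250-second wait leaves roughly 90 seconds of margin.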

The request to get a Discord profile fails; see the comment below.

mudler commented 2 weeks ago

@5u6r054 are you working on it?

Can you update the card to give more context ?

Luka-Loncar commented 2 weeks ago

@5u6r054 small reminder to provide more context.

mudler commented 2 weeks ago

I guess some initial work around this has been done in : https://github.com/masa-finance/masa-bittensor/pull/101 and https://github.com/masa-finance/masa-bittensor/pull/102

5u6r054 commented 2 weeks ago

This ticket doesn't really capture everything I'm doing on this. It has been discussed in meetings, and Mati and I planned to start working on it together last week, though other issues with maintaining devnet had blocked me. Yesterday I finally had time to work on it. I started by dockerizing and testing the subnet, then moved on to creating wallets for the subnet owner, validator, and miner, and funding those wallets via the subnet docker instance so that the subnet owner can register a subnet and the validator and miner can then register to it.

Mati was focusing on the validator and miner part, and funding them from the continuously running devnet, not from a dockerized local subnet.

I am also focused on multi-environment deployment and cicd of all components, not just the validator and miner.

So - this ticket isn't really covering all that, and also, Mati is probably doing more for this ticket specifically (maybe)? I don't know.

Also, this ticket is just a title and has no description or exit criteria.

5u6r054 commented 2 weeks ago

Filled in a more detailed scope for this ticket and new title to match what I'm actually doing and what I intend for the complete version of this task.

5u6r054 commented 2 weeks ago

Also, linked my current draft PR to this, and added @grantdfoster @obasilakis @hide-on-bush-x and myself as assignees

5u6r054 commented 2 weeks ago

This ticket may only be partially complete by the end of the sprint - it's more of a thing than it seemed to be in its original version.

Luka-Loncar commented 1 week ago

Should we maybe split it into more tickets to make it more modular from a delivery perspective?

5u6r054 commented 1 week ago

@Luka-Loncar maybe we could split it into one ticket concerned with having docker-compose stand up all desired components, and another subsequent ticket about putting it into place as a github action?

5u6r054 commented 1 week ago

@obasilakis made progress on this and I'm picking up where he left off, which is after subnet registration, trying to set hyperparameters on it.

5u6r054 commented 1 week ago

@obasilakis @grantdfoster my question now is this:

After setting up the validator (registering it and successfully using it to set the weights_rate_limit hyperparameter), I wait a bit, then try:

btcli root boost --netuid 1 --increase 1 --wallet.name validator --wallet.hotkey validator_hotkey --subtensor.chain_endpoint ws://subtensor_machine:9945

and it fails with this error:

validator_machine  | Raw Weights -> Normalized weights: 
validator_machine  |         [0. 1. 0.] -> 
validator_machine  |         [0. 1. 0.]
validator_machine  | 
validator_machine  | Do you want to set the following root weights?:
validator_machine  |   weights: [0. 1. 0.]
subtensor_machine  | 2024-06-27 01:51:24 💤 Idle (1 peers), best: #33 (0xd8ea…0d42), finalized #30 (0x5037…ee87), ⬇ 0.6kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:51:24 💤 Idle (1 peers), best: #33 (0xd8ea…0d42), finalized #30 (0x5037…ee87), ⬇ 0.5kiB/s ⬆ 0.6kiB/s    
miner_machine      | Miner running... 1719453087.4422514
subtensor_machine  | 2024-06-27 01:51:29 💤 Idle (1 peers), best: #33 (0xd8ea…0d42), finalized #31 (0x6b8d…a1d7), ⬇ 0.4kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:51:29 💤 Idle (1 peers), best: #33 (0xd8ea…0d42), finalized #31 (0x6b8d…a1d7), ⬇ 0.5kiB/s ⬆ 0.4kiB/s    
miner_machine      | Miner running... 1719453092.4431078
subtensor_machine  | 2024-06-27 01:51:34 💤 Idle (1 peers), best: #33 (0xd8ea…0d42), finalized #31 (0x6b8d…a1d7), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:51:34 💤 Idle (1 peers), best: #33 (0xd8ea…0d42), finalized #31 (0x6b8d…a1d7), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:51:36 Successfully ran block step.    
subtensor_machine  | 2024-06-27 01:51:36 do_set_root_weights( origin:9a60c3b2a3adc0bdce4d36f2c194c2198d30977b8ddca666ac7f570d913b102a (5FZ7yb2b...) netuid:0, uids:[1], values:[65535])    
subtensor_machine  | 2024-06-27 01:51:36 check_version_key( network_version_key:0, version_key:0 )    
subtensor_machine  | 2024-06-27 01:51:36 ✨ Imported #34 (0x5ed8…88ce)    
subtensor_machine  | 2024-06-27 01:51:36 🙌 Starting consensus session on top of parent 0xd8eaf36803966a744aa235e7def451dca65778cf7eadb31f4c10a9d0cdbf0d42    
subtensor_machine  | 2024-06-27 01:51:36 Successfully ran block step.    
subtensor_machine  | 2024-06-27 01:51:36 do_set_root_weights( origin:9a60c3b2a3adc0bdce4d36f2c194c2198d30977b8ddca666ac7f570d913b102a (5FZ7yb2b...) netuid:0, uids:[1], values:[65535])    
subtensor_machine  | 2024-06-27 01:51:36 check_version_key( network_version_key:0, version_key:0 )    
subtensor_machine  | 2024-06-27 01:51:36 🎁 Prepared block for proposing at 34 (3 ms) [hash: 0x020dcd4402e33b71c269e436e4b7c9e7a3987078db6950120c94d942da6ba634; parent_hash: 0xd8ea…0d42; extrinsics (2): [0xce47…62be, 0x76d1…10a2]    
subtensor_machine  | 2024-06-27 01:51:36 🔖 Pre-sealed block for proposal at 34. Hash now 0x5ed8edd19a4bc934e91c8354a57edc49033b52bb5f5199023d9e71352e5288ce, previously 0x020dcd4402e33b71c269e436e4b7c9e7a3987078db6950120c94d942da6ba634.    
subtensor_machine  | 2024-06-27 01:51:36 ✨ Imported #34 (0x5ed8…88ce)    
miner_machine      | Miner running... 1719453097.443883
subtensor_machine  | 2024-06-27 01:51:39 💤 Idle (1 peers), best: #34 (0x5ed8…88ce), finalized #32 (0xdc58…5685), ⬇ 0.5kiB/s ⬆ 0.6kiB/s    
subtensor_machine  | 2024-06-27 01:51:39 💤 Idle (1 peers), best: #34 (0x5ed8…88ce), finalized #32 (0xdc58…5685), ⬇ 0.6kiB/s ⬆ 0.5kiB/s    
miner_machine      | Miner running... 1719453102.4453526
subtensor_machine  | 2024-06-27 01:51:44 💤 Idle (1 peers), best: #34 (0x5ed8…88ce), finalized #32 (0xdc58…5685), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:51:44 💤 Idle (1 peers), best: #34 (0x5ed8…88ce), finalized #32 (0xdc58…5685), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
miner_machine      | Miner running... 1719453107.4471545
subtensor_machine  | 2024-06-27 01:51:48 🙌 Starting consensus session on top of parent 0x5ed8edd19a4bc934e91c8354a57edc49033b52bb5f5199023d9e71352e5288ce    
subtensor_machine  | 2024-06-27 01:51:48 Successfully ran block step.    
subtensor_machine  | 2024-06-27 01:51:48 🎁 Prepared block for proposing at 35 (3 ms) [hash: 0x49dd1b29bf324ab68ebf6eb2e67d9964a1737ecc5d0f021428c02165834d2632; parent_hash: 0x5ed8…88ce; extrinsics (1): [0x5376…5617]    
subtensor_machine  | 2024-06-27 01:51:48 🔖 Pre-sealed block for proposal at 35. Hash now 0x5c2dcda7e3566c9ee52aa5daa11879ed802859a4e3f42c10f097e86b0b3d8b6e, previously 0x49dd1b29bf324ab68ebf6eb2e67d9964a1737ecc5d0f021428c02165834d2632.    
subtensor_machine  | 2024-06-27 01:51:48 ✨ Imported #35 (0x5c2d…8b6e)    
subtensor_machine  | 2024-06-27 01:51:48 Successfully ran block step.    
subtensor_machine  | 2024-06-27 01:51:48 ✨ Imported #35 (0x5c2d…8b6e)    
subtensor_machine  | 2024-06-27 01:51:49 💤 Idle (1 peers), best: #35 (0x5c2d…8b6e), finalized #33 (0xd8ea…0d42), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:51:49 💤 Idle (1 peers), best: #35 (0x5c2d…8b6e), finalized #33 (0xd8ea…0d42), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
miner_machine      | Miner running... 1719453112.4478164
subtensor_machine  | 2024-06-27 01:51:54 💤 Idle (1 peers), best: #35 (0x5c2d…8b6e), finalized #33 (0xd8ea…0d42), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:51:54 💤 Idle (1 peers), best: #35 (0x5c2d…8b6e), finalized #33 (0xd8ea…0d42), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
miner_machine      | Miner running... 1719453117.4485605
subtensor_machine  | 2024-06-27 01:51:59 💤 Idle (1 peers), best: #35 (0x5c2d…8b6e), finalized #33 (0xd8ea…0d42), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:51:59 💤 Idle (1 peers), best: #35 (0x5c2d…8b6e), finalized #33 (0xd8ea…0d42), ⬇ 0.5kiB/s ⬆ 0.5kiB/s    
subtensor_machine  | 2024-06-27 01:52:00 🙌 Starting consensus session on top of parent 0x5c2dcda7e3566c9ee52aa5daa11879ed802859a4e3f42c10f097e86b0b3d8b6e    
subtensor_machine  | 2024-06-27 01:52:00 Successfully ran block step.    
subtensor_machine  | 2024-06-27 01:52:00 🎁 Prepared block for proposing at 36 (2 ms) [hash: 0x64a4affd6c3c55d05ce976a510d6fda7cf7af67599e06a30df95932a1c5d6519; parent_hash: 0x5c2d…8b6e; extrinsics (1): [0x8403…48e5]    
subtensor_machine  | 2024-06-27 01:52:00 🔖 Pre-sealed block for proposal at 36. Hash now 0x0b57f4435d07850a9efdaf7b9d81662cfbf79f3b88c6833f7cc1feb19e1d8ca7, previously 0x64a4affd6c3c55d05ce976a510d6fda7cf7af67599e06a30df95932a1c5d6519.    
subtensor_machine  | 2024-06-27 01:52:00 ✨ Imported #36 (0x0b57…8ca7)    
subtensor_machine  | 2024-06-27 01:52:00 Successfully ran block step.    
subtensor_machine  | 2024-06-27 01:52:00 ✨ Imported #36 (0x0b57…8ca7)    
validator_machine  |   uids: [0 1 2]? [y/n]: False
validator_machine  | {
validator_machine  |     'type': 'Module',
validator_machine  |     'name': 'SettingWeightsTooFast',
validator_machine  |     'docs': [
validator_machine  |         'The hotkey is attempting to set weights twice within the duration of 
validator_machine  | net_tempo/2 blocks.'
validator_machine  |     ]
validator_machine  | }
validator_machine  | ❌ Failed: {'type': 'Module', 'name': 'SettingWeightsTooFast', 'docs': ['The 
validator_machine  | hotkey is attempting to set weights twice within the duration of net_tempo/2 
validator_machine  | blocks.']}
validator_machine  | 2024-06-27 01:52:01.303 |     WARNING      | Set weights -  - Failed: {'type': 'Module', 'name': 'SettingWeightsTooFast', 'docs': ['The hotkey is attempting to set weights twice within the duration of net_tempo/2 blocks.']}

In the list of hyperparameters, I see a "tempo" setting that is 360 for the subnet, 100 for the root, and 99 for the #3 one:

subnet_machine     |               Subnet Hyperparameters - NETUID: 1 - unknown               
subnet_machine     |  HYPERPARAMETER                    VALUE                 NORMALIZED      
subnet_machine     |    rho                             10                    10              
subnet_machine     |    kappa                           32767                 0.4999923705    
subnet_machine     |    immunity_period                 5000                  5000            
subnet_machine     |    min_allowed_weights             1                     1               
subnet_machine     |    max_weight_limit                65535                 1               
subnet_machine     |    tempo                           360                   360             
subnet_machine     |    min_difficulty                  18446744073709551615  1               
subnet_machine     |    max_difficulty                  18446744073709551615  1               
subnet_machine     |    weights_version                 0                     0               
subnet_machine     |    weights_rate_limit              100                   100             
subnet_machine     |    adjustment_interval             360                   360             
subnet_machine     |    activity_cutoff                 5000                  5000            
subnet_machine     |    registration_allowed            True                  True            
subnet_machine     |    target_regs_per_interval        1                     1               
subnet_machine     |    min_burn                        1                     τ0.000000001    
subnet_machine     |    max_burn                        100000000000          τ100.000000000  
subnet_machine     |    bonds_moving_avg                900000                4.878909776e-14 
subnet_machine     |    max_regs_per_block              1                     1               
subnet_machine     |    serving_rate_limit              50                    50              
subnet_machine     |    max_validators                  64                    64              
subnet_machine     |    adjustment_alpha                17893341751498265066  0.97            
subnet_machine     |    difficulty                      10000000              5.421010862e-13 
subnet_machine     |    commit_reveal_weights_interval  1000                  1000            
subnet_machine     |    commit_reveal_weights_enabled   False                 False  

How is this set? Should I try setting it to something lower to wait less time?

Another option I considered was building subtensor with the "fast-blocks" feature. I tried that, but I kept hitting another error we've seen before, about the PoW block being in the past or negative...

5u6r054 commented 1 week ago

Waiting long enough allows the "boost" command to be executed without an error (currently waiting 2250 seconds).
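
Instead of a fixed sleep, one option is to poll the chain height and proceed as soon as enough blocks have passed. This is a minimal sketch, not what the current setup does: `wait_for_blocks` and its parameters are hypothetical names, and `get_block` stands in for whatever returns the current block number (for example the bittensor client's current-block query, if the installed version provides one). The demo at the bottom uses a fake chain so the block runs on its own.

```python
import time
from typing import Callable


def wait_for_blocks(
    get_block: Callable[[], int],
    blocks: int,
    poll_s: float = 1.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Wait until the chain height has advanced by `blocks`; return the final height."""
    target = get_block() + blocks
    deadline = time.monotonic() + timeout_s
    while True:
        current = get_block()
        if current >= target:
            return current
        if time.monotonic() > deadline:
            raise TimeoutError(f"chain did not reach block {target} in {timeout_s}s")
        sleep(poll_s)


# Fake chain for illustration: each query returns the next height, starting at 33.
heights = iter(range(33, 1000))
print(wait_for_blocks(lambda: next(heights), blocks=3, sleep=lambda s: None))  # 36
```

With a tempo of 360, `blocks` would be 180; the script could then run the boost immediately after the call returns rather than sleeping for a hard-coded duration.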

The validator registers and starts, but requests to the API endpoint to get a Discord profile fail:

validator_machine  | 2024-06-27 03:13:14.189 |      ERROR       |  - Error during the handle responses process: [Errno 32] Broken pipe - 
validator_machine  | Traceback (most recent call last):
validator_machine  |   File "/app/masa/validator/discord/profile/forward.py", line 33, in forward_query
validator_machine  |     return await self.forward(request=Request(query=query, type=RequestType.DISCORD_PROFILE.value), get_rewards=get_rewards, parser_object=DiscordProfileObject)
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/app/masa/validator/forwarder.py", line 56, in forward
validator_machine  |     self.validator.set_weights()
validator_machine  |   File "/app/masa/base/validator.py", line 242, in set_weights
validator_machine  |     ) = bt.utils.weight_utils.process_weights_for_netuid(
validator_machine  |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/bittensor/utils/weight_utils.py", line 269, in process_weights_for_netuid
validator_machine  |     min_allowed_weights = subtensor.min_allowed_weights(netuid=netuid)
validator_machine  |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/bittensor/subtensor.py", line 3491, in min_allowed_weights
validator_machine  |     call = self._get_hyperparameter(
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/bittensor/subtensor.py", line 3189, in _get_hyperparameter
validator_machine  |     if not self.subnet_exists(netuid, block):
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/bittensor/subtensor.py", line 3997, in subnet_exists
validator_machine  |     _result = self.query_subtensor("NetworksAdded", block, [netuid])
validator_machine  |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/bittensor/subtensor.py", line 2912, in query_subtensor
validator_machine  |     return make_substrate_call_with_retry()
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/decorator.py", line 232, in fun
validator_machine  |     return caller(func, *(extras + args), **kw)
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/retry/api.py", line 73, in retry_decorator
validator_machine  |     return __retry_internal(partial(f, *args, **kwargs), exceptions, tries, delay, max_delay, backoff, jitter,
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/retry/api.py", line 33, in __retry_internal
validator_machine  |     return f()
validator_machine  |            ^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/bittensor/subtensor.py", line 2903, in make_substrate_call_with_retry
validator_machine  |     return self.substrate.query(
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/substrateinterface/base.py", line 987, in query
validator_machine  |     block_hash = self.get_chain_head()
validator_machine  |                  ^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/substrateinterface/base.py", line 441, in get_chain_head
validator_machine  |     response = self.rpc_request("chain_getHead", [])
validator_machine  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/substrateinterface/base.py", line 272, in rpc_request
validator_machine  |     self.websocket.send(json.dumps(payload))
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/websocket/_core.py", line 297, in send
validator_machine  |     return self.send_frame(frame)
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/websocket/_core.py", line 337, in send_frame
validator_machine  |     l = self._send(data)
validator_machine  |         ^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/websocket/_core.py", line 559, in _send
validator_machine  |     return send(self.sock, data)
validator_machine  |            ^^^^^^^^^^^^^^^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/websocket/_socket.py", line 179, in send
validator_machine  |     return _send()
validator_machine  |            ^^^^^^^
validator_machine  |   File "/opt/bittensor-venv/lib/python3.12/site-packages/websocket/_socket.py", line 156, in _send
validator_machine  |     return sock.send(data)
validator_machine  |            ^^^^^^^^^^^^^^^
validator_machine  | BrokenPipeError: [Errno 32] Broken pipe
validator_machine  | INFO:     192.168.65.1:24370 - "GET /data/discord/profile/691473028525195315 HTTP/1.1" 200 OK
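
The traceback ends in a BrokenPipeError on the websocket send, which suggests the connection to subtensor had gone stale by the time set_weights ran, and retrying on the same socket would not help in that case. The thread does not say how this was resolved; the following is only a sketch of one common pattern, rebuilding the connection before retrying. `call_with_reconnect` and `FakeClient` are hypothetical names used for illustration and are not part of bittensor or the masa code.

```python
from typing import Callable, TypeVar

C = TypeVar("C")
T = TypeVar("T")


def call_with_reconnect(connect: Callable[[], C], op: Callable[[C], T], attempts: int = 3) -> T:
    """Run op(client); if the connection has dropped, open a new client and retry."""
    client = connect()
    for attempt in range(1, attempts + 1):
        try:
            return op(client)
        except (BrokenPipeError, ConnectionResetError):
            if attempt == attempts:
                raise
            client = connect()  # the old socket is dead, so open a fresh one
    raise RuntimeError("attempts must be at least 1")


class FakeClient:
    """Stand-in client: the first instance behaves like a stale websocket."""
    made = 0

    def __init__(self) -> None:
        FakeClient.made += 1
        self.stale = FakeClient.made == 1

    def min_allowed_weights(self) -> int:
        if self.stale:
            raise BrokenPipeError(32, "Broken pipe")
        return 1


print(call_with_reconnect(FakeClient, lambda c: c.min_allowed_weights()))  # 1
```

In the validator, `connect` would be whatever constructs the subtensor client for ws://subtensor_machine:9945; that wiring is an assumption and is not shown here.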

5u6r054 commented 6 days ago

Closing notes:

The masa protocol used in this setup is not local to the setup. It is a running, staked masa protocol node with Discord ability, and the setup points to it using a variable. A masa protocol node local to the setup could be added later if we wish; since we now have a masa-faucet, it can get its own test masa for staking, so this would not be hard.

Other than that, this is a self-contained masa-subtensor network that starts from zero, creating:

  1. a subtensor instance (downloads version 1.1.2 subtensor from opentensor and starts it up)
  2. a subnet-creator instance (it makes a wallet, gets funds, registers a new subnet)
  3. a validator-creator (it makes a wallet, gets funds, registers on the root, sets hyperparameter weights_rate_limit to 1, registers on the new subnet, boosts subnet weights, then starts the validator up)
  4. a miner-creator (it makes a wallet, gets funds, registers the miner, and starts a miner that talks to an external persistent masa-protocol node that has Discord ability)