masa-finance / masa-bittensor

Bittensor Subnet Config
https://masa.ai
MIT License

spike: [Validator] Setting weights doesn't work #49

Closed hide-on-bush-x closed 3 months ago

hide-on-bush-x commented 4 months ago

set_weights

You can find this method defined in masa/base/validator.py.

Problems encountered

I have already tried to fix some of these, but have not reached a state where it works.

Error during the query and score process: 'numpy.ndarray' object has no attribute 'to'

        (
            processed_weight_uids,
            processed_weights,
        ) = bt.utils.weight_utils.process_weights_for_netuid(
            uids=self.metagraph.uids.to("cpu"),
            weights=raw_weights,
            netuid=self.config.netuid,
            subtensor=self.subtensor,
            metagraph=self.metagraph,
        )

In this snippet, self.metagraph.uids.to("cpu") fails because of some kind of type mismatch. I tried to work around it with torch.from_numpy(self.metagraph.uids).to("cpu") and with torch.from_numpy(self.metagraph.uids), but that led to: Error during the query and score process: '<' not supported between instances of 'builtin_function_or_method' and 'int'. I found no solution for this; I think torch.from_numpy is not the right approach.

Maybe metagraph.uids is returning an unexpected value; not sure.
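
The first error suggests that, in the installed bittensor version, self.metagraph.uids is already a NumPy array, which has no .to() method. The second error is consistent with library code reading .size on a torch tensor: ndarray.size is an integer, while torch.Tensor.size is a method, so comparing it with an int raises this kind of TypeError. If that is the cause, converting the torch values to NumPy may work better than converting the uids to torch. A minimal duck-typed sketch follows; the helper name as_numpy_like is hypothetical and not part of bittensor.

```python
def as_numpy_like(x):
    """Return a NumPy array for torch-style tensors; pass anything else through.

    Hypothetical helper, not part of bittensor. It relies only on the
    standard torch.Tensor methods detach(), cpu() and numpy().
    """
    # torch tensors expose detach(); NumPy arrays and plain lists do not.
    if callable(getattr(x, "detach", None)):
        return x.detach().cpu().numpy()
    # Already a NumPy array (or a plain list): leave it unchanged.
    return x
```

With this, the call could pass uids=as_numpy_like(self.metagraph.uids) and weights=as_numpy_like(raw_weights). Whether process_weights_for_netuid accepts NumPy inputs depends on the installed version, so treat this as something to try rather than a confirmed fix.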

Not waiting for finalization or inclusion. Assume successful

A PR I found related to this: https://github.com/opentensor/bittensor/pull/1692/commits/559ec182bbda95f37de2ecd693aae6f430b6f437

I tried setting weights manually from the Python console, which resulted in the message above ("Not waiting for finalization or inclusion. Assume successful"). Even though the call returns True, subnet.W returns an empty array.

subtensor.set_weights(wallet=wallet, netuid=1, uids=[2], weights=[0.2])

Useful information

Found this thread where they talk about this topic: https://discord.com/channels/799672011265015819/799672011814862902/1209127894195109939

To see the weights, you need to instantiate the subnet in the following way:

 subnet = bt.metagraph(1, 'ws://54.205.45.3:9945', lite=False)
 subnet.W
 array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)

This way it returns the matrix and you can debug properly
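
For larger subnets the printed matrix gets hard to read. A small sketch that lists only the non-zero entries; the function name nonzero_weights is made up for illustration, and the row/column interpretation is an assumption about how the matrix is laid out.

```python
def nonzero_weights(matrix):
    """List (row, column, value) for every non-zero entry of a 2-D weight matrix.

    Assumption: the row index is the uid that set the weight and the column
    index is the target uid. Works on nested lists; for a NumPy array such
    as subnet.W, pass subnet.W.tolist().
    """
    return [
        (row_idx, col_idx, value)
        for row_idx, row in enumerate(matrix)
        for col_idx, value in enumerate(row)
        if value != 0
    ]


# Same shape as the matrix printed above: only row 5, column 2 is set.
W = [[0.0] * 12 for _ in range(12)]
W[5][2] = 1.0
print(nonzero_weights(W))  # [(5, 2, 1.0)]
```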

After setting weights manually once, I started getting more errors, such as BrokenPipeError: [Errno 32] Broken pipe, or No attempt made. Perhaps it is too soon to set weights! - set_weights failed -

Update

I commented out the first part of the set_weights method, and the weights now seem to be set. I am not entirely sure about this, but it may work for now.

This way we skip the call to process_weights_for_netuid, which was failing no matter what.

    def set_weights(self):
        """
        Sets the validator weights to the metagraph hotkeys based on the scores it has received from the miners. The weights determine the trust and incentive level the validator assigns to miner nodes on the network.
        """

        # Check if self.scores contains any NaN values and log a warning if it does.
        if torch.isnan(self.scores).any():
            bt.logging.warning(
                "Scores contain NaN values. This may be due to a lack of responses from miners, or a bug in your reward functions."
            )

        # L1-normalize the scores so that their absolute values sum to 1.
        # NaN values are not replaced here and would propagate into raw_weights.
        raw_weights = torch.nn.functional.normalize(self.scores, p=1, dim=0)

        print(f"Raw weights: {raw_weights}")
        bt.logging.debug("raw_weights", raw_weights)
        # Process the raw weights to final_weights via subtensor limitations.
        # (
        #     processed_weight_uids,
        #     processed_weights,
        # ) = bt.utils.weight_utils.process_weights_for_netuid(
        #     uids=torch.from_numpy(self.metagraph.uids),
        #     weights=raw_weights,
        #     netuid=self.config.netuid,
        #     subtensor=self.subtensor,
        #     metagraph=self.metagraph,
        # )
        # bt.logging.debug("processed_weights", processed_weights)
        # bt.logging.debug("processed_weight_uids", processed_weight_uids)

        # Convert to uint16 weights and uids.
        (
            uint_uids,
            uint_weights,
        ) = bt.utils.weight_utils.convert_weights_and_uids_for_emit(
            uids=torch.from_numpy(self.metagraph.uids), weights=raw_weights
        )
        bt.logging.debug("uint_weights", uint_weights)
        bt.logging.debug("uint_uids", uint_uids)

        print("uint_weights", uint_weights)
        print("uint_uids", uint_uids)

        # Set the weights on chain via our subtensor connection.
        result, msg = self.subtensor.set_weights(
            wallet=self.wallet,
            netuid=self.config.netuid,
            uids=uint_uids,
            weights=uint_weights,
            wait_for_finalization=False,
            wait_for_inclusion=False,
            version_key=self.spec_version,
        )

        print(f"Result {result}")
        print(f"Msg {msg}")
        if result is True:
            bt.logging.info("set_weights on chain successfully!")
        else:
            bt.logging.error("set_weights failed", msg)
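
The method above only logs a warning when self.scores contains NaN; the normalization that follows would still carry those values into raw_weights. A pure-Python sketch of a NaN-safe L1 normalization, to illustrate the idea (l1_normalize is a hypothetical name; it works on a plain list, not a tensor):

```python
import math


def l1_normalize(scores):
    """Replace NaN scores with 0, then scale so the absolute values sum to 1."""
    cleaned = [0.0 if math.isnan(s) else float(s) for s in scores]
    total = sum(abs(s) for s in cleaned)
    if total == 0:
        # Nothing to distribute; return the zeros instead of dividing by zero.
        return cleaned
    return [s / total for s in cleaned]


print(l1_normalize([1.0, float("nan"), 3.0]))  # [0.25, 0.0, 0.75]
```

In torch, calling torch.nan_to_num(self.scores, nan=0.0) before the normalize call should have a similar effect.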

teslashibe commented 3 months ago

@hide-on-bush-x does this config help at all?

https://docs.bittensor.com/subnets/subnet-hyperparameters

hide-on-bush-x commented 3 months ago

weights_rate_limit sounds like it may help with the "No attempt made. Perhaps it is too soon to set weights! - set_weights failed -" error.
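
A rough sketch of how such a limit could be checked before calling set_weights. It assumes weights_rate_limit is the minimum number of blocks between two weight updates from the same hotkey; the function name and the exact boundary (>= rather than >) are assumptions, not taken from the subtensor source.

```python
def blocks_until_allowed(current_block, last_update_block, weights_rate_limit):
    """Blocks still to wait before weights may be set again (0 means allowed now).

    Assumption: an update is allowed once at least `weights_rate_limit`
    blocks have passed since the previous update.
    """
    elapsed = current_block - last_update_block
    return max(0, weights_rate_limit - elapsed)


print(blocks_until_allowed(current_block=1050, last_update_block=1000, weights_rate_limit=100))  # 50
print(blocks_until_allowed(current_block=1200, last_update_block=1000, weights_rate_limit=100))  # 0
```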

Thanks for the input @teslashibe

teslashibe commented 3 months ago

@hide-on-bush-x be sure to ask in the developer discord. I don't see you active there and they are incredibly responsive. https://discord.gg/dkW2CaMD

hide-on-bush-x commented 3 months ago

@teslashibe will ask about this after we wrap up the docs, thx

hide-on-bush-x commented 3 months ago

Found some answers:

BrokenPipeError: [Errno 32] Broken pipe means that the metagraph is out of sync (you need to reinstantiate the subtensor object or call metagraph.sync).

No attempt made. Perhaps it is too soon to set weights! - set_weights failed - is just a rate limit, configured in the hyperparameters.
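
One way to handle the BrokenPipeError case is a small retry wrapper that reconnects before trying again. This is only a sketch: call_with_reconnect is a hypothetical helper, and both callables are supplied by the caller (for example, one wrapping subtensor.set_weights and one that re-creates the subtensor object or re-syncs the metagraph).

```python
import time


def call_with_reconnect(action, reconnect, attempts=3, delay_seconds=1.0):
    """Run action(); on BrokenPipeError call reconnect() and try again.

    Hypothetical helper. `action` and `reconnect` are zero-argument
    callables provided by the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except BrokenPipeError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            reconnect()
            time.sleep(delay_seconds)
```

The rate-limit message is a different case: retrying immediately will not help there, so it is better to wait for the configured number of blocks.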

More info about hyperparameters in #61

grantdfoster commented 3 months ago

Hyperparams weights_rate_limit update:

Attempted to set the rate limit via weights_rate_limit on devnet netuid 1 (a subnet I created / have the owner wallet for). Getting the following error when running btcli sudo set --param weights_rate_limit --value 5 --netuid 1 --subtensor.chain_endpoint ws://54.205.45.3:9945:

raise InvalidScaleTypeValueException("Invalid byte for Compact")

Investigating further, will update here.

mudler commented 3 months ago

@grantdfoster can you share what's the status here? thank you!

grantdfoster commented 3 months ago

@mudler this has been fixed with both #62 and #82... solved with setting weights_rate_limit to a lower value and upgrading our devnet to the latest subtensor - this can be closed again!

Luka-Loncar commented 3 months ago

Perfect! Thanks Grant!