opentensor / bittensor

Internet-scale Neural Networks
https://www.bittensor.com/
MIT License
851 stars 301 forks source link

"logging.info" is still broken from version 6.10.0 #1836

Open jbonilla-tao opened 3 months ago

jbonilla-tao commented 3 months ago

Describe the bug

It seems the new update 6.12.0 is mandatory to use testnet now. However, logging is still broken starting from version 6.10.0.

bt.logging.info doesn't show up. only warn and error show up

https://discord.com/channels/1120750674595024897/1179557631061413968/1235755142377373756

To Reproduce

bt.logging.info doesn't show up. only warn and error show up

Expected behavior

bt.logging.info doesn't show up. only warn and error show up

Screenshots

No response

Environment

Unix BT version >= 6.10.0 Python 3.10.13

Additional context

No response

jbonilla-tao commented 2 months ago

Adding this to the code seems to fix the issue

    if config.logging.debug:
        bt.logging.enable_debug()
    elif config.logging.trace:
        bt.logging.enable_trace()
    else:
        bt.logging.enable_default()

        Thanks @ceterum1
ceterum1 commented 2 months ago

I havent looked at the Bittensor code in detail, but I think the issue has two parts: 1) The old parameters --logging.debug and --logging.trace no longer work 2) The default loglevel is set to 0 because enable_default() is not called correctly during the logger init

This code:

import bittensor as bt
import time

bt.logging.info("First INFO message")
bt.logging.error("First ERROR message")
bt.logging.warning("First WARNING message")
bt.logging.debug("First DEBUG message")
bt.logging.trace("First TRACE message")
bt.logging.success("First SUCCESS message")

time.sleep(1)

print(f'Log level before enable_default: {bt.logging.get_level()}')
bt.logging.enable_default()
print(f'Log level after enable_default: {bt.logging.get_level()}')

bt.logging.info("Second INFO message")
bt.logging.error("Second ERROR message")
bt.logging.warning("Second WARNING message")
bt.logging.debug("Second DEBUG message")
bt.logging.trace("Second TRACE message")
bt.logging.success("Second SUCCESS message")

time.sleep(1)

print(f'Log level before enable_debug: {bt.logging.get_level()}')
bt.logging.enable_debug()
print(f'Log level after enable_debug: {bt.logging.get_level()}')

time.sleep(1)

bt.logging.debug("Second DEBUG message")
bt.logging.trace("Second TRACE message")

print(f'Log level before enable_trace: {bt.logging.get_level()}')
bt.logging.enable_trace()time.sleep(1)
print(f'Log level after enable_trace: {bt.logging.get_level()}')

time.sleep(1)

bt.logging.trace("Third TRACE message")

Results the following log entries:

Log level before enable_default: 0
2024-06-20 08:10:19.361 |      ERROR       |  - First ERROR message - 
2024-06-20 08:10:19.362 |     WARNING      |  - First WARNING message - 
Log level after enable_default: 20
Log level before enable_debug: 20
2024-06-20 08:10:19.364 |       INFO       |  - Second INFO message - 
2024-06-20 08:10:19.364 |      ERROR       |  - Second ERROR message - 
2024-06-20 08:10:19.364 |     WARNING      |  - Second WARNING message - 
2024-06-20 08:10:19.364 |     SUCCESS      |  - Second SUCCESS message - 
2024-06-20 08:10:19.365 |       INFO       | bittensor:loggingmachine.py:298 | Enabling debug.
2024-06-20 08:10:19.366 |       INFO       | bittensor:loggingmachine.py:305 | Debug enabled.
Log level after enable_debug: 10
Log level before enable_trace: 10
2024-06-20 08:10:19.366 |      DEBUG       | bittensor:loggingmachine.py:354 |  - Second DEBUG message - 
Log level after enable_trace: 5
2024-06-20 08:10:19.367 |       INFO       | bittensor:loggingmachine.py:276 | Enabling trace.
2024-06-20 08:10:19.367 |       INFO       | bittensor:loggingmachine.py:283 | Trace enabled.
2024-06-20 08:10:19.368 |      TRACE       | bittensor:loggingmachine.py:349 |  - Third TRACE message - 
(.venv) joonas@heimomaster:~/bt-dev/llm-defender-subnet$ python test.py 
2024-06-20 08:11:30.618 |      ERROR       |  - First ERROR message - 
2024-06-20 08:11:30.618 |     WARNING      |  - First WARNING message - 
Log level before enable_default: 0
Log level after enable_default: 20
2024-06-20 08:11:31.623 |       INFO       |  - Second INFO message - 
2024-06-20 08:11:31.623 |      ERROR       |  - Second ERROR message - 
2024-06-20 08:11:31.624 |     WARNING      |  - Second WARNING message - 
2024-06-20 08:11:31.624 |     SUCCESS      |  - Second SUCCESS message - 
Log level before enable_debug: 20
2024-06-20 08:11:32.626 |       INFO       | bittensor:loggingmachine.py:298 | Enabling debug.
Log level after enable_debug: 10
2024-06-20 08:11:32.629 |       INFO       | bittensor:loggingmachine.py:305 | Debug enabled.
Log level before enable_trace: 102024-06-20 08:11:33.631 |      DEBUG       | bittensor:loggingmachine.py:354 |  - Second DEBUG message - 

Log level after enable_trace: 5
2024-06-20 08:11:33.634 |       INFO       | bittensor:loggingmachine.py:276 | Enabling trace.
2024-06-20 08:11:33.636 |       INFO       | bittensor:loggingmachine.py:283 | Trace enabled.
2024-06-20 08:11:34.637 |      TRACE       | bittensor:loggingmachine.py:349 |  - Third TRACE message - 

We implemented a workaround in SN14 with custom logging parameter to get the Bittensor>=7.0.0 into use due to other reasons, so the code snippet @jbonilla-tao mentioned wont work for everyone.

I think this can be fixed quite easily in Bittensor by making sure the set_default() is called in the logger init and if any of the --logging.X parameters have been set and adjust the init accordingly. I can take a look at it and submit a pull request, but I wont be able to do it for at least a few weeks, so anyone is free to also submit a pull request with the fix.

There are also couple of other issues we have noticed with the logging when running Bittensor==7.2.0. While some are not directly related to Bittensor and some are personal preferences, I though to still mention them here: 1) Quite a many subnets are using PM2 as a tool to run miners and validators. The colours do not work anymore with PM2 even when setting FORCE_COLOR=1, which makes the logs more difficult to read 2) Sometimes the log message is split into multiple lines:

0|llm-defender-subnet-miner-0-testnet  | 2024-06-19 19:31:36.016
0|llm-defender-subnet-miner-0-testnet  |  | 
0|llm-defender-subnet-miner-0-testnet  |     SUCCESS     
0|llm-defender-subnet-miner-0-testnet  |  |  - Processed payload synapse from UID: 9 - Confidence: 0.0 - UUID: 37b3196f-a088-4af1-ac92-1d88a724ee10 - 

This should all be on a one-line, like this: 1|llm-defender-subnet-miner-1-testnet | 2024-06-19 19:25:56.363 | SUCCESS | - Processed payload synapse from UID: 9 - Confidence: 0.0 - UUID: b1172882-fa29-4b2a-a89d-84b781d58af2 - 3) Extra characters are added into the message field (- in the start and end of the message). I think it would be better to keep the message field (i.e., whatever is logged with the logger command) unmodified as it makes it easier for people to collect and process the logs. 4) Logs seem to be at times coming in incorrect order (most likely related to asyncio). Timestamps are correct so not a huge problem, but can cause some confusion when the messages are not displayed in correct order. You can kinda see this in the sample I had earlier (prints and logging messages are not aligned).

jbonilla-tao commented 2 months ago

Indeed the loss of colors in the pm2 logs is a blocker for us right now. Considering replacing all logging with a different library