huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Disable tqdm. #34438

Open ivanstepanovftw opened 1 week ago

ivanstepanovftw commented 1 week ago

System Info

transformers==4.45.2

Who can help?

No response

Reproduction

import asyncio
import logging

import numpy as np
import torch
import tqdm.asyncio as tqdm
from tqdm.contrib.logging import logging_redirect_tqdm
from transformers import AutoModel

logger = logging.getLogger(__name__)

async def main():
    text_encoder_model = AutoModel.from_pretrained("jinaai/jina-embeddings-v3", trust_remote_code=True)

    if torch.cuda.is_available():
        text_encoder_model.to(torch.device('cuda'))
        text_encoder_model = text_encoder_model.to(torch.bfloat16)

        for i in tqdm.tqdm(range(100), desc="Encoding and commiting", unit="messages", unit_scale=32):
            embedding = text_encoder_model.encode("Hello", task="retrieval.passage", truncate_dim=128 * 6)
            logger.info(f"{np.mean(embedding, axis=0)}")
            await asyncio.sleep(0.5)

if __name__ == '__main__':
    # logging.basicConfig(level=logging.ERROR, format="[+%(relativeCreated)d ms] [%(asctime)s] %(message)s", datefmt="%H:%M:%S")
    logging.basicConfig(level=logging.INFO, format="[+%(relativeCreated)d ms] [%(asctime)s] %(message)s", datefmt="%H:%M:%S")
    logger.setLevel(logging.DEBUG)

    with logging_redirect_tqdm(loggers=[logger]):
        asyncio.run(main())

Expected behavior

No useless "Encoding: 1/1" progress bar on every encode call.

ivanstepanovftw commented 1 week ago

Do you like it?

Encoding: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.78s/it]
0.0004168510204181075                                                                                                                                         
Encoding and commiting:   0%|                                                                                                  | 0/3200 [00:01<?, ?messages/s]
[+6905 ms] [03:19:49] 0.0004168510204181075                                                                                                                   
Encoding: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 16.17it/s]
0.0004168510204181075                                                                                                                                         
Encoding and commiting:   1%|▉                                                                                        | 32/3200 [00:02<03:46, 14.00messages/s]
[+7474 ms] [03:19:50] 0.0004168510204181075                                                                                                                   
Encoding: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 16.54it/s]
0.0004168510204181075                                                                                                                                         
Encoding and commiting:   2%|█▊                                                                                       | 64/3200 [00:02<02:05, 25.08messages/s]
[+8041 ms] [03:19:50] 0.0004168510204181075
Encoding: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 16.58it/s]
0.0004168510204181075                  
Encoding and commiting:   3%|██▋                                                                                      | 96/3200 [00:03<01:32, 33.61messages/s]
[+8607 ms] [03:19:51] 0.0004168510204181075
Encoding and commiting:   3%|██▋                                                                                      | 96/3200 [00:03<02:03, 25.16messages/s]

There is no way to disable tqdm through parameters, and the solutions found in #14889, #30733, and #9275 do not help. I also cannot be sure that you are using the correct tqdm from tqdm.asyncio. Using the wrong one may cause deadlocks or other thread-lock problems when tqdm runs in an asynchronous context.

ivanstepanovftw commented 1 week ago

The combination of the following lines of code works:

    logging.basicConfig(level=logging.ERROR, format="[+%(relativeCreated)d ms] [%(asctime)s] %(message)s", datefmt="%H:%M:%S")
    logger.setLevel(logging.DEBUG)

    with logging_redirect_tqdm():

This approach worked, but it feels unnecessarily bloated. Having to redirect logging just to suppress unwanted tqdm bars suggests that the library could be streamlined to offer a more straightforward way of disabling internal progress bars/logging.
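For illustration only, a narrower workaround can be sketched as a context manager that makes newly created bars default to disable=True while a single call runs. This is not an API that transformers or tqdm provides; progress_bars_disabled is a hypothetical helper name, and the sketch assumes the bar class accepts a disable keyword, as tqdm.tqdm does.

```python
import contextlib
from functools import partialmethod


@contextlib.contextmanager
def progress_bars_disabled(bar_cls):
    """Temporarily make new bar_cls instances default to disable=True.

    Hypothetical helper: bar_cls is assumed to accept a `disable` keyword
    in its constructor (tqdm.tqdm does).
    """
    original_init = bar_cls.__init__
    had_own_init = "__init__" in vars(bar_cls)
    # partialmethod pre-fills disable=True; an explicit disable=... passed
    # by the caller still takes precedence.
    bar_cls.__init__ = partialmethod(original_init, disable=True)
    try:
        yield
    finally:
        # Restore the class exactly as it was before entering the block.
        if had_own_init:
            bar_cls.__init__ = original_init
        else:
            del bar_cls.__init__


# Possible usage with the reproduction above (not executed here):
#
#   import tqdm
#   with progress_bars_disabled(tqdm.tqdm):
#       embedding = text_encoder_model.encode("Hello", task="retrieval.passage")
```

The patch is process-wide while the block is active, so it is best kept around a single synchronous call rather than across an await. Bars created before entering the block are unaffected, and subclasses that override __init__ without calling the parent constructor are not covered.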

hlky commented 1 week ago

In this case the progress bar comes from the remote modeling code, not from transformers itself. Try passing show_progress_bar=False to encode.
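A minimal sketch of that suggestion, assuming the remote encode method accepts show_progress_bar as described in this comment. encode_quietly is a hypothetical helper name, not part of the model's code.

```python
def encode_quietly(model, text, **encode_kwargs):
    """Call model.encode with its internal progress bar switched off.

    Assumes model.encode accepts a `show_progress_bar` keyword, as the
    remote modeling code is said to in this thread. An explicit
    show_progress_bar passed by the caller is left untouched.
    """
    encode_kwargs.setdefault("show_progress_bar", False)
    return model.encode(text, **encode_kwargs)


# Possible usage inside the loop from the reproduction (not executed here):
#
#   embedding = encode_quietly(
#       text_encoder_model,
#       "Hello",
#       task="retrieval.passage",
#       truncate_dim=128 * 6,
#   )
```

Passing show_progress_bar=False directly in the encode call is equivalent. The helper only avoids repeating the argument at every call site.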