Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0

Progress Bar prints many lines when validation_step is defined #15283

Open davidgilbertson opened 2 years ago

davidgilbertson commented 2 years ago

Bug description

With a simple model like this:

import pytorch_lightning as pl
import torch
from torch import nn, optim


class MyPLModule(pl.LightningModule):
    def __init__(self, model):
        super().__init__()
        self.model = model
        self.loss_func = nn.CrossEntropyLoss()

    def configure_optimizers(self):
        return optim.Adam(self.model.parameters(), lr=0.02)

    def training_step(self, batch: tuple[torch.Tensor, ...], batch_idx):
        x, y = batch
        preds = self.model(x)
        loss = self.loss_func(preds, y)
        return loss

    def validation_step(self, batch: tuple[torch.Tensor, ...], batch_idx):
        # I do nothing!
        return None

I get a console that looks like this. It runs the progress bar as it should up to about 86%, then spits out some validation stuff and from then on each update to the progress 'bar' is a new line:

image

If I don't have a validation_step method in my class, the progress bar works as it should.

How to reproduce the bug

No response

Error messages and logs

No response

Environment


* CUDA:
    - GPU:
        - NVIDIA GeForce RTX 3090
    - available:         True
    - version:           11.6
* Lightning:
    - pytorch-ignite:    0.4.10
    - pytorch-lightning: 1.7.7
    - torch:             1.12.1+cu116
    - torchaudio:        0.12.1+cu116
    - torchmetrics:      0.10.0
    - torchvision:       0.13.1+cu116
* Packages:
    - absl-py:           1.2.0
    - aiohttp:           3.8.1
    - aiosignal:         1.2.0
    - altgraph:          0.17.2
    - anyio:             3.6.1
    - argcomplete:       2.0.0
    - argon2-cffi:       21.3.0
    - argon2-cffi-bindings: 21.2.0
    - asttokens:         2.0.8
    - async-timeout:     4.0.2
    - attrs:             22.1.0
    - babel:             2.10.3
    - backcall:          0.2.0
    - beautifulsoup4:    4.11.1
    - black:             22.8.0
    - bleach:            5.0.1
    - blis:              0.7.8
    - bottleneck:        1.3.5
    - cachetools:        5.2.0
    - catalogue:         2.0.8
    - category-encoders: 2.5.0
    - certifi:           2022.6.15
    - cffi:              1.15.1
    - charset-normalizer: 2.1.1
    - click:             8.1.3
    - cmdstanpy:         1.0.7
    - colorama:          0.4.5
    - colour:            0.1.5
    - convertdate:       2.4.0
    - cycler:            0.11.0
    - cymem:             2.0.6
    - cython:            0.29.32
    - datasets:          2.4.0
    - debugpy:           1.6.3
    - decorator:         5.1.1
    - defusedxml:        0.7.1
    - deprecated:        1.2.13
    - dill:              0.3.5.1
    - distlib:           0.3.4
    - dtreeviz:          1.3.7
    - entrypoints:       0.4
    - ephem:             4.1.3
    - et-xmlfile:        1.1.0
    - executing:         1.0.0
    - fastai:            2.7.9
    - fastbook:          0.0.28
    - fastcore:          1.5.24
    - fastdownload:      0.0.7
    - fastjsonschema:    2.16.1
    - fastprogress:      1.0.3
    - filelock:          3.8.0
    - fonttools:         4.37.1
    - frozenlist:        1.3.1
    - fsspec:            2022.8.2
    - future:            0.18.2
    - google-auth:       2.11.0
    - google-auth-oauthlib: 0.4.6
    - graphviz:          0.20.1
    - greenlet:          1.1.2
    - grpcio:            1.48.1
    - hijri-converter:   2.2.4
    - holidays:          0.15
    - html5lib:          1.1
    - htmlmin:           0.1.12
    - huggingface-hub:   0.9.1
    - idna:              3.3
    - imagehash:         4.3.0
    - iniconfig:         1.1.1
    - ipykernel:         6.15.2
    - ipympl:            0.9.2
    - ipython:           8.5.0
    - ipython-genutils:  0.2.0
    - ipywidgets:        8.0.2
    - jedi:              0.18.1
    - jinja2:            3.1.2
    - joblib:            1.1.0
    - json5:             0.9.10
    - jsonschema:        4.15.0
    - jupyter:           1.0.0
    - jupyter-client:    7.3.5
    - jupyter-console:   6.4.4
    - jupyter-core:      4.11.1
    - jupyter-server:    1.18.1
    - jupyterlab:        3.4.6
    - jupyterlab-pygments: 0.2.2
    - jupyterlab-server: 2.15.1
    - jupyterlab-widgets: 3.0.3
    - kaggle:            1.5.12
    - kiwisolver:        1.4.4
    - korean-lunar-calendar: 0.2.1
    - langcodes:         3.3.0
    - lunarcalendar:     0.0.9
    - lxml:              4.9.1
    - markdown:          3.4.1
    - markupsafe:        2.1.1
    - matplotlib:        3.5.3
    - matplotlib-inline: 0.1.6
    - missingno:         0.5.1
    - mistune:           2.0.4
    - mplcursors:        0.5.1
    - mplfinance:        0.12.9b1
    - mpmath:            1.2.1
    - multidict:         6.0.2
    - multimethod:       1.8
    - multiprocess:      0.70.13
    - murmurhash:        1.0.8
    - mypy:              0.971
    - mypy-extensions:   0.4.3
    - nbclassic:         0.4.3
    - nbclient:          0.6.7
    - nbconvert:         7.0.0
    - nbformat:          5.4.0
    - nest-asyncio:      1.5.5
    - networkx:          2.8.6
    - notebook:          6.4.12
    - notebook-shim:     0.1.0
    - numexpr:           2.8.3
    - numpy:             1.23.3
    - oauthlib:          3.2.0
    - opencv-python:     4.6.0.66
    - openpyxl:          3.0.10
    - packaging:         21.3
    - pandas:            1.5.0
    - pandas-profiling:  3.2.0
    - pandocfilters:     1.5.0
    - parso:             0.8.3
    - pathspec:          0.10.1
    - pathy:             0.6.2
    - patsy:             0.5.2
    - pefile:            2021.9.3
    - phik:              0.12.2
    - pickleshare:       0.7.5
    - pillow:            9.2.0
    - pip:               22.3
    - pipx:              1.0.0
    - platformdirs:      2.5.2
    - playwright:        1.25.2
    - plotly:            5.10.0
    - pluggy:            1.0.0
    - preshed:           3.0.7
    - prometheus-client: 0.14.1
    - prompt-toolkit:    3.0.31
    - prophet:           1.1
    - protobuf:          3.19.4
    - psutil:            5.9.2
    - pure-eval:         0.2.2
    - py:                1.11.0
    - pyarrow:           9.0.0
    - pyasn1:            0.4.8
    - pyasn1-modules:    0.2.8
    - pycparser:         2.21
    - pydantic:          1.9.2
    - pydeprecate:       0.3.2
    - pyee:              8.1.0
    - pygithub:          1.55
    - pygments:          2.13.0
    - pyinstaller:       4.10
    - pyinstaller-hooks-contrib: 2022.2
    - pyjwt:             2.4.0
    - pymeeus:           0.5.11
    - pynacl:            1.5.0
    - pynvml:            11.4.1
    - pyparsing:         3.0.9
    - pyrsistent:        0.18.1
    - pytest:            7.1.3
    - python-dateutil:   2.8.2
    - python-slugify:    6.1.2
    - pytorch-ignite:    0.4.10
    - pytorch-lightning: 1.7.7
    - pytz:              2022.2.1
    - pywavelets:        1.3.0
    - pywin32:           304
    - pywin32-ctypes:    0.2.0
    - pywinpty:          2.0.7
    - pyyaml:            6.0
    - pyzmq:             23.2.1
    - qtconsole:         5.3.2
    - qtpy:              2.2.0
    - regex:             2022.8.17
    - requests:          2.28.1
    - requests-oauthlib: 1.3.1
    - responses:         0.18.0
    - rsa:               4.9
    - scikit-learn:      1.1.2
    - scipy:             1.9.1
    - seaborn:           0.11.2
    - send2trash:        1.8.0
    - sentencepiece:     0.1.97
    - setuptools:        63.4.3
    - setuptools-git:    1.2
    - six:               1.16.0
    - smart-open:        5.2.1
    - sniffio:           1.3.0
    - soundfile:         0.11.0
    - soupsieve:         2.3.2.post1
    - spacy:             3.4.1
    - spacy-legacy:      3.0.10
    - spacy-loggers:     1.0.3
    - srsly:             2.4.4
    - stack-data:        0.5.0
    - statsmodels:       0.13.2
    - sympy:             1.11.1
    - tabulate:          0.8.10
    - tangled-up-in-unicode: 0.2.0
    - tenacity:          8.1.0
    - tensorboard:       2.10.0
    - tensorboard-data-server: 0.6.1
    - tensorboard-plugin-wit: 1.8.1
    - terminado:         0.15.0
    - text-unidecode:    1.3
    - thinc:             8.1.0
    - threadpoolctl:     3.1.0
    - tinycss2:          1.1.1
    - tokenizers:        0.12.1
    - tomli:             2.0.1
    - tomlkit:           0.11.4
    - torch:             1.12.1+cu116
    - torchaudio:        0.12.1+cu116
    - torchmetrics:      0.10.0
    - torchvision:       0.13.1+cu116
    - tornado:           6.2
    - tqdm:              4.64.1
    - traitlets:         5.3.0
    - transformers:      4.21.3
    - treeinterpreter:   0.2.3
    - typer:             0.4.2
    - typing-extensions: 4.3.0
    - ujson:             5.4.0
    - urllib3:           1.26.12
    - userpath:          1.8.0
    - virtualenv:        20.13.3
    - virtualenv-clone:  0.5.7
    - visions:           0.7.4
    - wasabi:            0.10.1
    - waterfallcharts:   3.8
    - wcwidth:           0.2.5
    - webencodings:      0.5.1
    - websocket-client:  1.4.1
    - websockets:        10.1
    - werkzeug:          2.2.2
    - wget:              3.2
    - wheel:             0.37.1
    - widgetsnbextension: 4.0.3
    - wrapt:             1.14.1
    - xgboost:           1.6.2
    - xxhash:            3.0.0
    - yarl:              1.8.1
    - yellowbrick:       1.5
* System:
    - OS:                Windows
    - architecture:
        - 64bit
        - WindowsPE
    - processor:         Intel64 Family 6 Model 151 Stepping 2, GenuineIntel
    - python:            3.10.2
    - version:           10.0.22621

More info

This is most likely specific to PyCharm, which uses a custom console. To replicate this, make sure you've selected "Run with Python Console" in the run config.

Probably the most relevant bug in PyCharm is this one.

Running as a script in a terminal, it works as expected (but I can't even see the progress bar when I use VSCode and "Run in interactive window").

I only started looking at Lightning yesterday and it looks great, but the wall of console text makes it a bit painful to work with, since it swamps any other messages. Maybe I'll soon learn how to disable the progress meter or use the rich one. Regardless, I would suggest that if this occurs for all PyCharm users, at the very least it would be worth putting a note in the docs to lessen the bad first impression for new users.

cc @borda @awaelchli

davidgilbertson commented 2 years ago

After a bit of digging I've found what I'm sure you already know: that this is a known issue with Progress Bars. So I wonder if there's a way to detect if you're in an environment that's going to result in a wall of console text (check sys.stdout.isatty()?) and alter the output accordingly?

I notice that the tqdm source has a check for isatty(), but it only runs if disable=None. When Lightning calls Tqdm it appears to pass disable as either True or False (never None), so this check never gets to run. Even if it did, that check just turns the bar off entirely, which is not great, because a single bar works fine as long as nothing interrupts it.
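To make the point above concrete, here is a minimal sketch of the decision as described in this comment. It is a simplified paraphrase, not tqdm's actual source: the function name resolve_disable is hypothetical, and io.StringIO is only a stand-in for a console that does not report itself as a TTY.

```python
import io
from typing import Optional, TextIO


def resolve_disable(disable: Optional[bool], stream: TextIO) -> bool:
    """Hypothetical paraphrase of the isatty() check described above."""
    # The stream is only consulted when the caller passed disable=None.
    if disable is None and hasattr(stream, "isatty") and not stream.isatty():
        return True
    # An explicit True/False is taken at face value; isatty() is never called.
    return bool(disable)


non_tty = io.StringIO()  # stand-in for a non-TTY console

auto = resolve_disable(None, non_tty)       # auto-detection kicks in
explicit = resolve_disable(False, non_tty)  # explicit False bypasses the check

print(auto, explicit)
```

Under these assumptions, passing None lets the stream decide, while an explicit False keeps the bar enabled even on a non-TTY stream, which matches the behaviour reported in this issue.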

If you really wanted to get this working nicely for PyCharm (and other non-tty) users, you could keep track of how many progress bars you're running and check isatty() and act accordingly.
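One way that idea could look, as a rough sketch only: BarRegistry, open_bar and close_bar are hypothetical names, not part of Lightning or tqdm, and the policy (show every bar on a TTY, only the first bar otherwise) is an assumption about what "act accordingly" might mean.

```python
import io
from typing import TextIO


class BarRegistry:
    """Hypothetical helper: allow nested bars only on a real terminal."""

    def __init__(self, stream: TextIO) -> None:
        self.stream = stream
        self.active = 0  # number of bars currently open

    def _is_tty(self) -> bool:
        isatty = getattr(self.stream, "isatty", None)
        return bool(isatty and isatty())

    def open_bar(self) -> bool:
        """Register a new bar and report whether it should be displayed."""
        # On a TTY every bar is shown; elsewhere only the first open bar is.
        show = self._is_tty() or self.active == 0
        self.active += 1
        return show

    def close_bar(self) -> None:
        """Unregister one bar, never dropping below zero."""
        self.active = max(0, self.active - 1)


registry = BarRegistry(io.StringIO())  # StringIO.isatty() returns False
train_visible = registry.open_bar()    # first bar on a non-TTY stream
val_visible = registry.open_bar()      # nested bar on a non-TTY stream
registry.close_bar()                   # the validation bar finishes

print(train_visible, val_visible)
```

A progress bar callback could consult such a registry before creating each bar, so the training bar keeps updating in place while nested validation bars are suppressed on consoles like PyCharm's.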

awaelchli commented 2 years ago

@davidgilbertson We have many PyCharm users here who also develop Lightning, and we have seen this issue from the very beginning of the progress bar integration. PyCharm does not support the features tqdm needs to act correctly when nesting bars. More info in their FAQ here: https://github.com/tqdm/tqdm/#faq-and-known-issues. I don't know of any way to work around it reliably.

If you would like to experiment with this or contribute something here, that would be great. Even a note in the docs, like you said, would already be valuable IMO.

awaelchli commented 2 years ago

FYI, we also have a second progress bar implementation using rich: https://pytorch-lightning.readthedocs.io/en/stable/common/progress_bar.html?highlight=rich#richprogressbar

davidgilbertson commented 2 years ago

Thanks for the reply.

I've just put together my own hacky little thing. I'm sure as I get to know Lightning better I'll come up with something more elegant. For now at least it prevents the wall of text.

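import sys

import pytorch_lightning as pl
from pytorch_lightning.callbacks import TQDMProgressBar


# Disable the validation, predict and test bars when stdout is not a TTY
# (e.g. PyCharm's Python Console); the main training bar is left untouched.
class MyProgressBar(TQDMProgressBar):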
    def init_validation_tqdm(self):
        bar = super().init_validation_tqdm()
        if not sys.stdout.isatty():
            bar.disable = True
        return bar

    def init_predict_tqdm(self):
        bar = super().init_predict_tqdm()
        if not sys.stdout.isatty():
            bar.disable = True
        return bar

    def init_test_tqdm(self):
        bar = super().init_test_tqdm()
        if not sys.stdout.isatty():
            bar.disable = True
        return bar

trainer = pl.Trainer(
    ...
    callbacks=[MyProgressBar()],
)

And it's a sign of a well-written package that I was so easily able to get in and modify this behaviour. Good stuff!

lucifermorningstar1305 commented 1 year ago

Hey guys,

Thanks for this excellent library. I have one question regarding the RichProgressBar(). I cannot see the progress bar during the training or validation stage when using it in my notebooks. Can anyone please help me with this?

iioSnail commented 1 year ago

I have the same problem.

pengzhenghao commented 1 year ago

"... it only runs if disable=None. When Lightning calls Tqdm it appears to pass disable as either True or False (never None) so this check doesn't get to run. This just turns ..."

You can try setting leave=True.

harryseely commented 1 year ago

Thank you @davidgilbertson for providing this code! It works great for me on PyCharm!

awaelchli commented 1 year ago

@harryseely @davidgilbertson If this works well, would you consider sending a PR to lightning? I'm sure the community who use PyCharm would appreciate this improvement and I'd be happy to test it out as well.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions - the Lightning Team!

sameervk commented 3 weeks ago

@davidgilbertson the issue still persists but thanks for the solution.