voxel51 / fiftyone

Refine high-quality datasets and visual AI models
https://fiftyone.ai
Apache License 2.0
8.8k stars 556 forks source link

[FR] Allow disabling of progress bar when adding samples to dataset #2436

Closed sebimarkgraf closed 9 months ago

sebimarkgraf commented 1 year ago

Proposal Summary

Allow a disabling of the progress tracking when adding samples. The progress bar could interfere with other output or progress bars and it should therefore be possible to disable the progress bar.

import fiftyone as fo

samples = ...
dataset = fo.Dataset()
dataset.add_samples(samples, progress=False)

Motivation

What areas of FiftyOne does this feature affect?

Details

The implementation could be done by forwarding the parameter to the add_samples signature and setting the default to the current value.

Willingness to contribute

The FiftyOne Community welcomes contributions! Would you or another member of your organization be willing to contribute an implementation of this feature?

brimoor commented 1 year ago

This is already supported via the fiftyone.config.show_progress_bars setting, which defaults to True.

You can disable progress bars in the current session like so:

import fiftyone as fo

fo.config.show_progress_bars = False

dataset = fo.Dataset()
dataset.add_samples(samples)

or permanently by modifying your config.

You can also temporarily disable progress bars using this pattern:

import fiftyone as fo
import fiftyone.core.utils as fou

with fou.SetAttributes(fo.config, show_progress_bars=False):
    dataset = fo.Dataset()
    dataset.add_samples(samples)
brimoor commented 1 year ago

We could also consider adding an add_samples(..., progress=True/False) parameter that controls whether to show progress on a per function call basis.

If we do that, I'd want to implement it consistently across the codebase, as there are a number of other methods that print progress bars. Here's a probably-incomplete list:

foz.load_zoo_dataset()
Dataset.add_samples()
Dataset.add_XXX()
Dataset.from_XXX()
SampleCollection.compute_metadata()
SampleCollection.apply_model()

Notably, the SampleCollection.iter_samples() method is not consistent with the above methods. It does have a progress=True parameter. But it is implemented in such a way that progress=True has no effect if fo.config.show_progress_bars == False:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

fo.config.show_progress_bars = False

# No progress is printed
# `fo.config.show_progress_bars` takes precedence over `progress=True` 
for sample in dataset.select_fields().iter_samples(progress=True):
    pass

It would be nice to unify the behavior of iter_samples() with the other methods if we're going to add more formal support for a progress=True option in the codebase.

The reason we initially chose to implement a global fo.config.show_progress_bars setting rather than exposing per-function progress=True/False options was to remove the need to wire up the many methods that utilize progress bars to pass the progress parameter all the way down to the actual progress bar. But, I could see value in allowing progress=True/False on a per-method basis to override fo.config.show_progress_bars. To achieve this, I would probably default progress=None so that we could have:

sebimarkgraf commented 1 year ago

That definitely makes sense and would be a nice use case to set a default and be able to overwrite it per method. I will adjust the PR to unify the behaviour of these methods.

Does it make sense to test the behaviour somehow or do we already have similar tests somewhere?

brimoor commented 1 year ago

Automated tests are in tests/unittests. The methods themselves should all be tested, but there aren't tests of whether progress bars render or not. To test that I guess you would need to monkey patch the fiftyone.core.utils.ProgressBar class so that you can check if the appropriate value for quiet makes it here: https://github.com/voxel51/fiftyone/blob/009c589d1805891e2d0b4c22b0189d8d4e1e94f3/fiftyone/core/utils.py#L825

In fact, the tests use the existing pattern I demonstrated above to disable progress bars, as they were messing with logging or execution (I forget): https://github.com/voxel51/fiftyone/blob/009c589d1805891e2d0b4c22b0189d8d4e1e94f3/tests/unittests/dataset_tests.py#L4132

sebimarkgraf commented 1 year ago

Would it make sense for the implementation to cleanup the Dynamic batcher and then just feed the correct value to the quiet kwarg? This would make all progress bars use the same class under the hood and therefore lead to consistent behaviour.

Currently, this splits the logic between the ProgressBar implementation and the DynamicBatcher which both handle this differently.

brimoor commented 1 year ago

All progress bars are ultimately rendered by ProgressBar, but the DynamicBatcher is an example of a util class that uses ProgressBar to render its progress.

I think what needs to happen is that all progress parameters need to default to progress=None so that fo.config.show_progress_bars can be used inside ProgressBar.__init__() to set the default.

It is annoying/unfortunate that the ProgressBar class uses quiet, which has the opposite meaning of progress.

brimoor commented 1 year ago

One important constraint here is that I'd really like to avoid any breaking changes to the public API. So, for example, we cannot rename iter_samples(progress=) to iter_samples(quiet=).

The most we could do is add a progress= kwarg to ProgressBar so that all callers could pass progress= instead of converting to quiet.

By the way, I've been close to acting on this feature request myself a few times, but each time I took in all the constraints and decided that the existing configuration options are fine for now...

sebimarkgraf commented 1 year ago

A possible fix for this would be something like this:

class ProgressBar(etau.ProgressBar):
    def __init__(self, *args, progress=None, **kwargs):
        if "quiet" in kwargs:
            # Allow overwrite with expected progress attribute
            progress = not kwargs["quiet"]

        if progress is None:
            # Use global config value
            progress = not fo.config.show_progress_bars

        kwargs["quiet"] = not progress

That way the progres bar can be used as expected, but we can use the progress parameter for all internal parameters. If the "quiet" kwarg is specified, it takes precedence. Therefore the priority would be:

  1. Kwarg "quiet"
  2. Kwarg "progress"
  3. Global setting "fo.config.show_progress_bar"
brimoor commented 9 months ago

This feature has (finally 😅) been completed in https://github.com/voxel51/fiftyone/pull/3979 and is available as of fiftyone==0.23.3!