micasense / imageprocessing

MicaSense RedEdge and Altum image processing tutorials
https://www.micasense.com
MIT License
257 stars 152 forks source link

imageset.py refactor #148

Closed and-viceversa closed 3 years ago

and-viceversa commented 3 years ago

First cut at a imageset.py refactor, to better expose the high level code.

TODO:

poynting commented 3 years ago

I like where you're going with this, but I'm not sure I see the benefit to embedding tqdm into the imageset. Why not make it the responsibility of the caller to instantiate and update tqdm using the current progress_callback approach?

Alternatively if there are reasons to embed tqdm instead, maybe we get rid of the progress_callback approach and use tqdm in the notebooks as well, since it claims to be "friendly with IPython/Jupyter notebooks."

and-viceversa commented 3 years ago

I like this approach because the base case (use_tqdm=False and progress_callback=None) has minimal stdout and the fancy stuff can be safely ignored.

A simple progress bar only requires use_tqdm=True - in my opinion this is what majority of users find useful and it is easy to demonstrate. At least right out of the box. And it plays really well with concurrent.futures. I haven't tested tqdm in a notebook, but it's probably one of the few things that translates easily between a notebook and user code regardless of environment. The compelling thing for me is that the iterable (self.captures) is in the object so tqdm should be in the object.

Customizable handling is still accomplished by defining your own function with progess_callback.

The main difficultly is that progress_callback returns numeric, while tqdm requires an iterable.

# with tqdm integrated
file_path = '...some/path/data/'

flight = ImageSet.from_directory(file_path, use_tqdm=True)

# these parameters derived in the usual way. abreviated here.
flight.process_imageset(output_stack_directory=some_dir_stacks,
                                         output_rgb_directory=some_rgb_stacks,
                                         warp_matrices=warp_matrices,
                                         irradiance=irradiance_list,
                                         img_type='reflectance',
                                         use_tqdm=True)

# without tqdm integration. something like this...
file_path = '...some/path/data/'
flight = ImageSet.from_directory(file_path)

# tqdm kwargs
kwargs = {
                    'total': len(matches),
                    'unit': ' Files',
                    'unit_scale': False,
                    'leave': True
                }

# processing params
params = {
            'warp_matrices': warp_matrices,
            'irradiance': irradiance,
            'img_type': img_type,
            'capture_len': len(self.captures[0].images),
            'output_stack_dir': output_stack_directory,
            'output_rgb_dir': output_rgb_directory,
            'overwrite': overwrite,
        }

for cap in tqdm(iterable=flight.captures, desc='Loading ImageSet', **kwargs):
    save_capture(params, cap)
poynting commented 3 years ago

It seems like if we want to go this route, tqdm can reasonably distinguish if it's running in a notebook. I've tried tqdm in general in a notebook and it looks like a good replacement for the current progress bar.

Possibly for the first step to not immediately break compatibility we could add support for both, but issue a warning that progress_callback will be depreciated.

What do you think of this approach?

and-viceversa commented 3 years ago

Added builtin PendingDeprecationWarning on progress_callback use. A simple heuristic could be to leave the warning until the tutorials are updated + some arbitrary amount of time.

Agreed that tqdm looks good in the notebooks. I don't have much experience with notebooks, but the imageset examples worked fine for me.

and-viceversa commented 3 years ago

@poynting Please let me know what else I can do here if this isn't sufficient.

poynting commented 3 years ago

To run, it was necessary for me to add the following back in the imageset.from_directory method prior to the if use_tqdm: line:

        if exiftool_path is None and os.environ.get('exiftoolpath') is not None:
            exiftool_path = os.path.normpath(os.environ.get('exiftoolpath'))