DeepTrackAI / DeepTrack2

Bm/migrate to torch #199

Closed BenjaminMidtvedt closed 6 months ago

BenjaminMidtvedt commented 9 months ago

This PR contains the changes for the next major release. The focus of this release is PyTorch / Deeplay integration, performance optimization, and a more rigorous way of utilizing static data from disk.

The breaking changes

Image

Pipelines no longer return Image objects by default. We made this change mainly to improve performance, but also because it was prohibitively difficult to keep Image compatible with other libraries.

If you do not call .get_properties() or access the .properties attribute of a pipeline's output, this change will not affect you.

If you do need Image, you can call pipeline.store_properties(). This will make the pipeline act just like in the prior release!
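
For example (a minimal sketch; the pipeline itself is hypothetical):

import deeptrack as dt

pipeline = dt.LoadImage("image.png") >> dt.NormalizeMinMax()
pipeline.store_properties()          # opt back in to Image outputs

image = pipeline()                   # now an Image, not a bare array
properties = image.get_properties()  # property access works as before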

Tensorflow

Tensorflow has been removed as a dependency. Maintaining a tensorflow dependency is no longer feasible, since recent tensorflow releases do not support current Python versions on Windows at all. Moreover, due to layout changes in tensorflow, uninstalling or changing tensorflow versions can leave the package system in an unrepairable state. As such, I suggest we leave tensorflow installation to the user, which will rightly direct the inevitable complaints to tensorflow instead of us.

Moreover, we will not actively support any tensorflow version newer than 2.10.

Instead of tensorflow, deeplay!

Moving forward, we will use the torch-based library deeplay for our models. We are still in the process of making the transition, so some functionality is still missing. However, in the long run, we expect deeplay to provide a much more flexible and powerful base on which to construct neural networks.

What's new?

Global changes

Many submodules are now lazy loaded, meaning they are only initialized when actually needed; this applies in particular to the modules containing tensorflow code. The main benefits are faster import times, and that heavy optional dependencies like tensorflow are only required if you actually use the functionality that depends on them.
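
As a rough illustration of the mechanism (a minimal sketch using PEP 562 module-level __getattr__; deeptrack's actual implementation may differ):

import importlib

# In a package's __init__.py: delay importing heavy submodules until first access.
_LAZY_SUBMODULES = {"models", "pytorch"}

def __getattr__(name):
    if name in _LAZY_SUBMODULES:
        return importlib.import_module(f"{__name__}.{name}")
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")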

Sources

Sources are a new way to operate on static datasets. They aim to solve a few common problems. Consider the following common pipeline:

import glob
import itertools

import deeptrack as dt

train_paths = glob.glob("train/*")
test_paths = glob.glob("test/*")

train_image = dt.LoadImage(itertools.cycle(train_paths)) >> dt.NormalizeMinMax()
test_image = dt.LoadImage(itertools.cycle(test_paths)) >> dt.NormalizeMinMax()

# Reuse each loaded image four times, applying random flips each time.
augmented_train_image = dt.Reuse(train_image, 4) >> dt.FlipLR() >> dt.FlipUD()

Though very simple, this approach (which is the recommended one) is actually very limited: the pipeline carries hidden state in the itertools.cycle iterators, there is no way to access a specific sample by index, and train and test need near-duplicate pipeline definitions.

All to say, the current approach is not optimized for static datasets.

Introducing Sources

Sources are, in brief, a way to separate the variables of a pipeline from the definition of the pipeline. The aim is to make the pipeline (as far as possible) functional, that is, dependent only on the direct input to the pipeline: pipeline(source).

As an introduction, here is the above pipeline using the new syntax:

train_paths = dt.sources.ImageFolder(root="train")
test_paths = dt.sources.ImageFolder(root="test")

# product expands each path into one source per combination of the given values.
train_sources = train_paths.product(flip_lr=[False, True], flip_ud=[False, True])
test_sources = test_paths.product(flip_lr=[False], flip_ud=[False])
sources = dt.sources.Sources(train_sources, test_sources)

# The pipeline reads its variables from the source instead of holding its own state.
pipeline = dt.LoadImage(sources.path) >> dt.FlipLR(sources.flip_lr) >> dt.FlipUD(sources.flip_ud)

To evaluate the pipeline, we now simply do one of:

x = pipeline(train_sources[320])    # evaluate a single source by index
# or
for source in test_sources[20:40]:  # iterate over a slice of sources
    image = pipeline(source)

If we want to iterate over paths rather than augmentations, we can do:

for source in train_paths:
    image = pipeline(source)

It should be clear that this solves all the issues from the current implementation. Moreover, this separation of logic allows for far more complexity, since we can define interesting ways of operating on sources that would not be possible on Features directly.

There are a few more points I'll mention briefly. I've included a sources.NumpyRNG implementation, a source that can be used for seeded pipeline evaluation; each source index has a unique seed.
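
The idea behind per-index seeding, sketched directly with numpy (the actual sources.NumpyRNG API may differ):

import numpy as np

def rng_for_index(index, base_seed=0):
    # Derive a unique but reproducible generator for each source index.
    return np.random.default_rng(np.random.SeedSequence([base_seed, index]))

noise = rng_for_index(320).normal(size=(64, 64))  # identical every time index 320 is evaluated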

Finally, there is a known bug: when using non-deterministic features such as Gaussian, the pipeline will produce the same noise pattern for every source unless you call .update() between evaluations. I have not yet decided how to solve this.

Pytorch integration

All pytorch code is currently in the lazy-loaded pytorch submodule. In the future, we might import some of it into the global namespace. Currently, we have two classes in pytorch:

pytorch.Dataset

Subclass of torch.utils.data.Dataset, which takes a deeptrack pipeline and creates a dataset compatible with standard DataLoaders. You either need to specify a length for the dataset or, preferably, a Source. Continuing the example from above:

train_dataset = dt.pytorch.Dataset(pipeline, train_sources)
test_dataset = dt.pytorch.Dataset(pipeline, test_sources)
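
The datasets can then be fed to standard loaders (a minimal sketch; the batch size is arbitrary, and the pipeline output is assumed to have been converted to tensors, see ToTensor below):

import torch

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=8, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=8)

for batch in train_loader:
    ...  # standard training step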

pytorch.ToTensor

A convenience Feature that can (and should) be appended to the end of pipelines to convert the output to pytorch tensors. It also supports setting the dtype, which is important since the numpy default is float64 while pytorch (usually) expects float32.
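
For example (a sketch; the name of the dtype argument is an assumption based on the description above):

import torch
import deeptrack as dt

# Hypothetical keyword: ToTensor is documented to support setting the dtype.
tensor_pipeline = pipeline >> dt.pytorch.ToTensor(dtype=torch.float32)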

deeplay

The deeplay library, if installed, can be accessed as deeptrack.deeplay.

BenjaminMidtvedt commented 9 months ago

Should close #172

giovannivolpe commented 9 months ago

Nice work!

Small comments:

  1. I like the sources. I agree that the noise behavior should be updated automatically.
  2. ToTensor is also good. Also good that it automatically converts to float32 (it should be the default)

About the speed enhancement, does it mean that the .get_property() method cannot be used internally, or in general? Because we are relying on it in some of the examples. If the plan is to remove it, we should avoid it completely. We can discuss the details in person.

BenjaminMidtvedt commented 9 months ago

> Nice work!
>
> Small comments:
>
> 1. I like the sources. I agree that the noise behavior should be updated automatically.
> 2. ToTensor is also good. Also good that it automatically converts to float32 (it should be the default)
>
> About the speed enhancement, does it mean that the .get_property() method cannot be used internally, or in general? Because we are relying on it in some of the examples. If the plan is to remove it, we should avoid it completely. We can discuss the details in person.

You're right. torch has an API, torch.get_default_dtype(), that returns the dtype expected by default. We can use that as the default out of ToTensor.

In general, sadly. Ideally, I would remove it entirely, but that might be too much of a breaking change.

giovannivolpe commented 9 months ago

@BenjaminMidtvedt I get this warning:

DeepTrack-2.0/deeptrack/scatterers.py:100: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if upsample is not 1:  # noqa: F632

I guess "# noqa: F632" should be removed from line 100 of scatterers.py?

BenjaminMidtvedt commented 9 months ago

@giovannivolpe It's a frustrating warning, because it is actually important that it is an "is not" instead of a "!=". If upsample is an array, "!=" results in another array instead of True or False.
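
For example, with numpy (upsample here stands in for an array-valued parameter):

import numpy as np

upsample = np.array([1, 2, 4])

print(upsample != 1)      # [False  True  True] -- an array, not a single bool
print(upsample is not 1)  # True -- a plain bool, which an if-statement needs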

BenjaminMidtvedt commented 9 months ago

The noqa comment is there to silence the linter about this same issue.