plenoptic-org / plenoptic

Visualize/test models for visual representation by synthesizing images.
https://plenoptic.readthedocs.io/en/latest/
MIT License

Pooled texture model #284

Open billbrod opened 3 months ago

billbrod commented 3 months ago

This pull request adds the pooled texture statistic model from Freeman and Simoncelli, 2011 and elsewhere. In our implementation, the statistics are computed with user-defined masks, which are passed as a list of 3d tensors, each of shape (masks, height, width). The list may contain any number of tensors, which are multiplied together to create the final pooling regions. The intended use case is to define the regions as two-dimensional and separable, either in x and y or in polar angle and eccentricity.
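As a rough illustration of how separable masks could be multiplied into pooling regions, here is a minimal sketch. It uses NumPy arrays as stand-ins for the tensors, and `combine_masks`, the band-shaped masks, and the weighted-mean statistic are all hypothetical names and choices made for this example; they are not taken from the pull request's code.

```python
import numpy as np

height, width = 8, 8

# Hypothetical separable masks: two horizontal bands and two vertical bands.
rows = np.arange(height)[:, None] * np.ones((1, width))
cols = np.ones((height, 1)) * np.arange(width)[None, :]
mask_y = np.stack([rows < 4, rows >= 4]).astype(float)  # shape (2, height, width)
mask_x = np.stack([cols < 4, cols >= 4]).astype(float)  # shape (2, height, width)
masks = [mask_y, mask_x]


def combine_masks(masks):
    """Multiply every mask in one array with every mask in the next.

    Assumption for this sketch: the final regions are all pairwise products,
    flattened into a single leading dimension.
    """
    combined = masks[0]
    for m in masks[1:]:
        combined = np.einsum("ahw,bhw->abhw", combined, m)
        combined = combined.reshape(-1, *m.shape[-2:])
    return combined


regions = combine_masks(masks)  # shape (4, height, width): one region per quadrant

# Example statistic: the weighted mean of an image within each region.
image = np.arange(height * width, dtype=float).reshape(height, width)
pooled_mean = np.einsum("mhw,hw->m", regions, image) / regions.sum(axis=(-2, -1))
```

Keeping the masks separate until this step means only `n_y + n_x` arrays need to be stored, while the product yields `n_y * n_x` regions.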

This pull request contains a notebook, pooled_texture_model, which demonstrates the usage of the model with different images and types of windows.

This pull request is not yet ready; the following questions still need to be addressed:

Math / model details:

Software design:

User interface:

Testing / relationship to other models:

review-notebook-app[bot] commented 3 months ago

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 15.52795% with 272 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ptic/simulate/models/portilla_simoncelli_masked.py | 14.73% | 272 Missing :warning: |

| Files with missing lines | Coverage Δ |
|---|---|
| src/plenoptic/simulate/models/__init__.py | 100.00% <100.00%> (ø) |
| ...c/plenoptic/simulate/models/portilla_simoncelli.py | 97.76% <100.00%> (+0.35%) :arrow_up: |
| src/plenoptic/tools/conv.py | 100.00% <100.00%> (ø) |
| ...ptic/simulate/models/portilla_simoncelli_masked.py | 14.73% <14.73%> (ø) |

... and 1 file with indirect coverage changes

ershook commented 1 month ago

Adding thoughts to questions/comments inline below:

Currently, both of these approaches are supported -- should they be? Or should we require the user to pass two masks that we multiply together? I can't think of any case in which someone would pass a list with three or more mask tensors, but that would work as well. Even if we allow a list of masks, should we allow more than two?

Personally, I don't see a need for more than two mask tensors, or more than two masks in general. That said, I do think allowing either format of mask input may be useful for simplicity. Using the product of two masks will matter most for large images, and I can imagine wanting to measure statistics on larger images where this will be important.

Before moving on, another question: how to handle mask normalization? The model works best if the individual masks (as shown above) sum to approximately 1 because, otherwise, some of the statistics end up being much larger than the others and optimization is hard. Currently, we're not normalizing within the model code, but should we?

I would normalize within the model code. It's one of the things I could imagine a first-time user forgetting, even if you make it very explicit. At the very least, raise a warning when the masks are not normalized, reminding the user that this is an issue.
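Both options mentioned here (normalizing inside the model, or only warning) could look roughly like the sketch below. It uses NumPy arrays as stand-ins for the mask tensors, and it assumes that "sum to approximately 1" means each mask's weights sum to 1 over height and width. The function names and the tolerance are hypothetical, not part of the pull request.

```python
import warnings

import numpy as np


def normalize_masks(masks, eps=1e-12):
    """Scale each mask so that its weights sum to 1 over (height, width)."""
    sums = masks.sum(axis=(-2, -1), keepdims=True)
    return masks / np.maximum(sums, eps)  # eps guards against all-zero masks


def check_normalized(masks, rtol=0.1):
    """Return True if every mask sums to roughly 1; emit a warning otherwise."""
    sums = masks.sum(axis=(-2, -1))
    ok = bool(np.allclose(sums, 1.0, rtol=rtol))
    if not ok:
        warnings.warn(
            "Masks do not sum to approximately 1; statistics may differ "
            "widely in scale, which can make optimization hard."
        )
    return ok


raw_masks = np.ones((3, 4, 4))  # each mask sums to 16, so the check fails
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the expected warning in this demo
    raw_ok = check_normalized(raw_masks)

normalized = normalize_masks(raw_masks)
normalized_ok = check_normalized(normalized)
```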

For the pooling regions, you want to avoid aliasing. This is an interaction between the sampling distance and the function used to generate the regions, and can be checked by seeing if they're equivariant to translation (if the windows are Cartesian) or rotation / dilation (if the windows are polar). I don't think the model should check this, but we can show an example of how to check this in our documentation and point out that this is important.

This seems reasonable to me.
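For the documentation example, one possible way to probe translation equivariance for Cartesian windows is sketched below in 1D with NumPy. The idea, which is an interpretation rather than anything stated in the pull request, is that if the windows are wide enough relative to their spacing, the pooled outputs for a shifted signal should be predictable from the pooled outputs for the original signal by band-limited (Fourier) interpolation. The Gaussian windows, the function names, and the specific sizes are all hypothetical. Polar windows would need the analogous check with rotations and dilations, which is not shown.

```python
import numpy as np


def gaussian_windows(n_pixels, spacing, sigma):
    """Circular 1D Gaussian windows centered every `spacing` pixels."""
    centers = np.arange(0, n_pixels, spacing)
    pos = np.arange(n_pixels)
    dist = np.abs(pos[None, :] - centers[:, None])
    dist = np.minimum(dist, n_pixels - dist)  # wrap-around distance
    return np.exp(-0.5 * (dist / sigma) ** 2)  # shape (n_windows, n_pixels)


def equivariance_error(windows, signal, shift, spacing):
    """Relative error between predicted and actual pooled outputs after a shift."""
    pooled = windows @ signal
    pooled_shifted = windows @ np.roll(signal, shift)
    # Predict the shifted outputs by delaying the pooled sequence by `shift`
    # pixels in the Fourier domain (frequencies are in cycles per pixel).
    freqs = np.fft.fftfreq(len(pooled), d=spacing)
    phase = np.exp(-2j * np.pi * freqs * shift)
    predicted = np.fft.ifft(np.fft.fft(pooled) * phase).real
    return np.linalg.norm(predicted - pooled_shifted) / np.linalg.norm(pooled_shifted)


rng = np.random.default_rng(0)
signal = rng.standard_normal(128)
spacing, shift = 8, 3  # shift is deliberately not a multiple of the spacing

# Wide windows (sigma larger than the spacing) versus narrow, undersampled ones.
err_wide = equivariance_error(gaussian_windows(128, spacing, 12.0), signal, shift, spacing)
err_narrow = equivariance_error(gaussian_windows(128, spacing, 2.0), signal, shift, spacing)
```

With these settings the wide windows should give an error close to zero, while the narrow windows should give a much larger one, which is the signature of aliasing.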

The foveated pooling windows are not currently in plenoptic and are a pain to implement yourself. PyTorch implementations exist (as well as ways to get existing windows into PyTorch), including one that I wrote. I can show examples making use of some of these; is that sufficient? None of them are Python packages (so they can't be installed with pip), but I could at least package the one I wrote. I've hesitated because it's definitely research code, but it would simplify the process for people.

I have always found the pooling-windows repo straightforward to use, so I think that is sufficient. You could package it, but cloning it and getting it running has always been easy for me.