A generalized concept of a 'SheetView' belongs in Imagen

Hi,

Topographica offers the concept of a 'SheetView', which combines a NumPy array, a timestamp and an Imagen bounding box into a single object. This data structure depends only on concepts introduced by Imagen and may be generalized in several useful ways.

Firstly, this concept generalizes to data that is not a 2D NumPy array but is still embedded in a two-dimensional space (specified by the bounding box). For example, you may have a collection of points or a collection of lines (i.e. contours) situated in a 2D Cartesian coordinate system. For now, let's call these possible generalizations of the 'SheetView' concept 'DataViews'.
Secondly, although data views may accept pattern arrays generated by Imagen, they only require an Imagen bounding region in their constructor. This would allow external code to use data views as a data structure in its own right. For example, a FileImage may be used to generate a NumPy image array, which is then fed to an edge detection algorithm (outside of Imagen!) to generate a contour data view object (in the same 2D Cartesian space as the original image). This demonstrates that data views of different dimensionality may be derived from each other. It would therefore be useful to allow data views to be linked in a parent-child relationship, grouping related data views together.
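
As a rough illustration of the parent-child idea, here is a minimal sketch. The `DataView` class, its constructor arguments and the plain `(left, bottom, right, top)` tuple standing in for an Imagen bounding region are all hypothetical and only meant to make the proposal concrete; the contour points are made up rather than produced by a real edge detector.

```python
import numpy as np


class DataView:
    """Hypothetical sketch: data embedded in a 2D region, with an optional parent."""

    def __init__(self, data, bounds, parent=None):
        self.data = data
        # Stand-in for an Imagen bounding region: (left, bottom, right, top).
        self.bounds = bounds
        self.parent = parent
        self.children = []
        if parent is not None:
            # Register with the parent so related views stay grouped together.
            parent.children.append(self)


# A 2D image array living in a unit-sized region.
image = DataView(np.zeros((4, 4)), bounds=(-0.5, -0.5, 0.5, 0.5))

# Made-up contour points (x, y) in the same coordinate system as the image,
# as might be returned by an edge detector outside of Imagen.
contour_points = np.array([[-0.25, 0.0], [0.0, 0.25], [0.25, 0.0]])
contours = DataView(contour_points, bounds=image.bounds, parent=image)
```

The derived contour view shares the bounds of the image it came from, and the image can reach all of its derived views through `children`.
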
Thirdly, 'SheetViews' in Topographica have timestamps. This feature would be worth keeping in Imagen while also extending support for time in general. By default, a view is associated only with the time given by its timestamp, but it should be possible to push new data collected across time into a single data view. For example, you could push the 2D NumPy arrays from a changing Imagen pattern into a single view (with the appropriate timestamp each time) to store the frames of an animation. Of course, there should also be facilities for retrieving a frame with a specific timestamp or a group of frames from a range of times (i.e. a slice).
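
One possible shape for the time support is sketched below. The `TimedView` class and its `push`, `at` and `between` methods are hypothetical names, and the half-open `[start, stop)` convention for the time slice is an assumption, not something decided here.

```python
import bisect

import numpy as np


class TimedView:
    """Hypothetical sketch: a view that accumulates timestamped frames."""

    def __init__(self, bounds):
        self.bounds = bounds  # stand-in for an Imagen bounding region
        self._times = []      # kept sorted
        self._frames = []     # frames aligned with self._times

    def push(self, timestamp, frame):
        # Insert so that timestamps stay sorted even if pushed out of order.
        i = bisect.bisect_right(self._times, timestamp)
        self._times.insert(i, timestamp)
        self._frames.insert(i, np.asarray(frame))

    def at(self, timestamp):
        # Retrieve the frame stored at exactly this timestamp.
        i = bisect.bisect_left(self._times, timestamp)
        if i == len(self._times) or self._times[i] != timestamp:
            raise KeyError(timestamp)
        return self._frames[i]

    def between(self, start, stop):
        # Time slice: all (timestamp, frame) pairs with start <= t < stop.
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_left(self._times, stop)
        return list(zip(self._times[lo:hi], self._frames[lo:hi]))


view = TimedView(bounds=(-0.5, -0.5, 0.5, 0.5))
for t in range(5):
    # Each frame is a 2x2 array filled with its own time index.
    view.push(float(t), np.full((2, 2), t))

frame = view.at(3.0)
clip = view.between(1.0, 3.0)
```
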
Finally, a DataView may also have an optional 'position' argument (or similar) to indicate whether the data is associated with a specific coordinate position in the N-dimensional space (a sample position within the region defined by the bounding box). For example, if data views are allowed to accept either NumPy arrays or other data views as input data, you could have a connection field expressed as a unit position in the destination sheet with data (weights) situated in the source sheet coordinate system. In Topographica this is currently implemented using UnitViews. This may not be the best solution and it might be better to implement multiple data view classes instead.
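
To make the optional 'position' concrete, here is a small sketch of the connection field case. `PositionedView` and its fields are hypothetical, the bounds tuple is again a stand-in for an Imagen bounding region, and the weights are random placeholder values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import numpy as np


@dataclass(eq=False)
class PositionedView:
    """Hypothetical sketch: a view optionally tied to one coordinate position."""

    data: np.ndarray
    bounds: Tuple[float, float, float, float]       # (left, bottom, right, top)
    position: Optional[Tuple[float, float]] = None  # None means no specific unit


# Placeholder weights of a connection field, situated in source sheet coordinates.
weights = np.random.default_rng(0).random((3, 3))

# The field belongs to the unit at (0.25, -0.25) in the destination sheet.
cf = PositionedView(weights, bounds=(-0.1, -0.1, 0.1, 0.1), position=(0.25, -0.25))

# The same data without a position behaves like an ordinary view.
plain = PositionedView(weights, bounds=(-0.1, -0.1, 0.1, 0.1))
```
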

There are many other extensions to this idea that may be useful. All data views could have an optional 'value_bounds' parameter that constrains the data values between a minimum and maximum value. There may also be a need for a 'bounding line' concept for data situated in a 1D space.
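
For instance, 'value_bounds' could be interpreted as clipping, as in the sketch below. The helper name `apply_value_bounds` is hypothetical, and clipping is only one reading of "constrains"; raising an error on out-of-range values would be another.

```python
import numpy as np


def apply_value_bounds(data, value_bounds=None):
    """Hypothetical helper: clip data to (minimum, maximum) if bounds are given."""
    data = np.asarray(data)
    if value_bounds is None:
        return data  # no constraint requested
    minimum, maximum = value_bounds
    return np.clip(data, minimum, maximum)


clipped = apply_value_bounds(np.array([-0.5, 0.2, 1.7]), value_bounds=(0.0, 1.0))
unbounded = apply_value_bounds(np.array([-0.5, 0.2, 1.7]))
```
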

All these ideas are fairly tentative at the moment. I just wish to show that 1) there is a valid concept that belongs in Imagen, 2) this concept may be extended in several useful ways, and 3) the proposed Imagen classes should replace SheetViews (and related classes) in Topographica.

Jean-Luc