minnerbe opened 10 months ago
I'd like to loop in @virginiascarlett too.
Ad 4: I like the definition of sample being tied to the actual creation of the data, but it seems that all your suggestions have a particular application in mind. What about "the smallest subunit of data acquisition, e.g., the energy sensed by a particular cell on a sensor"? I'm not familiar enough with all the methods of data acquisition, but this would also cover output of post-processing steps (e.g., a single location from a superresolution image).
Thanks for a productive chat today over coffee. ☕️ On second thought, you're right, we should create a PR so we can see the diffs. I see Michael already has a `restructure-definitions` branch, and it feels messy to me to create yet another branch for definitions... what's the deal with that branch? Maybe you could create a PR, Michael? 🙏
Here's what I have now:
Field of view: The physical extent of the observed space. In microscopy, FOV may be expressed as the diameter of the circular view seen through the eyepiece. In scientific bioimaging, FOV is typically expressed as the horizontal, vertical, and/or diagonal extent of the space captured by the digital sensor. For example, the FOV for a 2D image may be 44mm by 22mm, where 44mm is the horizontal extent and 22mm is the vertical extent of the observed space.
Image: A set of values, each of which is associated with a position in a domain and typically intended to be displayed on a screen. Ancillary data structures (such as a lookup table) may be required to display or interpret an image, but these are not part of the image itself. An image is often, but not necessarily, acquired by a sensor situated within an optical system. Images can be represented in compact forms, for example as a compressed sequence of bytes or as a discrete function over a finite domain, but these are not canonical uses of the word “image”; the word “image” by itself typically refers only to arrays and array-like data structures.
Origin: A special location that acts as a reference point, relative to which other locations are defined. Unless otherwise specified, the image's origin is the same as the array's origin (assuming the image is an array). An array's origin is typically the point in the discrete domain with the minimum index (usually zero) for all dimensions. Physical or anatomical spaces can also have origins; for example, in MR imaging, the anterior/posterior commissure is commonly regarded as an origin for the brain. The term "offset" is sometimes used to refer to the origin. (Note: Say something about a crop of an image?)
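To make the array-origin convention above concrete, here's a minimal sketch (the function name `index_to_physical` and the `physical_origin` and `spacing` parameters are made up for illustration and are not part of the definitions):

```python
import numpy as np

def index_to_physical(index, physical_origin, spacing):
    """Map a discrete array index to a physical location.

    The array's origin (index (0, 0, ...)) maps to physical_origin;
    spacing is the physical sampling interval along each dimension.
    """
    return np.asarray(physical_origin) + np.asarray(index) * np.asarray(spacing)

# A 2D image whose array origin sits at physical location (10.0, -5.0) mm,
# with 0.5 mm spacing along both dimensions.
print(index_to_physical((0, 0), physical_origin=(10.0, -5.0), spacing=(0.5, 0.5)))  # [10. -5.]
print(index_to_physical((4, 2), physical_origin=(10.0, -5.0), spacing=(0.5, 0.5)))  # [12. -4.]
```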
I'll reach out to you about another chat ☕️ so we can review these and discuss 'sample'.
We may be confusing an image with its representation as a digital, discretized array. This is like confusing a map of the terrain with the physical terrain itself. In some cases, we have purposely exploited this confusion: ImgLib2 (Image Library), for example, is really a library that deals with n-dimensional arrays in Java.
Typically in this context we are talking about an image produced by some kind of physical instrument, but more generally we are talking about some n-dimensional signal that is band-limited. A signal's band limit is related to its resolution and information content. A band-limited signal, such as a physical image, only contains detail up to a certain level. Technically, this means we cannot measure the amplitude of waves beyond a particular (spatial or temporal) frequency. Higher-frequency waves encode finer detail.
Images acquired by a physical instrument - say, an optical microscope - are inherently band-limited due to physical constraints on how much information can be collected. A lens, for example, must be finite in size, and detectors can only measure particular wavelengths of light.
A digital array is one way to record an image. Analog film is another. Both a digital array and film are representations of the image but are not the image itself. In particular, the image itself is not discrete. Because the image is band-limited, a finite set of discrete samples can be interpolated to exactly reconstruct the signal. The relationship between a signal's resolution and the number of discrete samples required for reconstruction is governed by the Nyquist-Shannon Sampling Theorem [1]. From the reconstructed signal or image, we can obtain a value at any point within the field of view (e.g. (3.3535, 5.1032)), not just at discrete points (e.g. (3, 5)).
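To make the reconstruction idea concrete, here's a minimal sketch of Whittaker-Shannon (sinc) interpolation for a 1D signal, assuming the signal was sampled above its Nyquist rate; the function name and test signal are made up for illustration:

```python
import numpy as np

def sinc_interpolate(samples, dx, x):
    """Whittaker-Shannon reconstruction of a band-limited 1D signal.

    samples : values taken at positions 0, dx, 2*dx, ...
    dx      : sampling interval (assumed to satisfy the Nyquist criterion)
    x       : arbitrary (possibly non-integer) position to evaluate
    """
    n = np.arange(len(samples))
    return np.sum(samples * np.sinc((x - n * dx) / dx))

# A band-limited test signal: a single sinusoid well below the Nyquist limit.
dx = 0.1
n = np.arange(100)
samples = np.sin(2 * np.pi * 0.7 * n * dx)

# Evaluate between the discrete sample positions.
x = 3.3535
print(sinc_interpolate(samples, dx, x))  # close to ...
print(np.sin(2 * np.pi * 0.7 * x))       # ... the true continuous value
```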
In summary, an image is an abstract notion distinct from its representation as a discrete digital array.
[1] C. E. Shannon, "Communication in the Presence of Noise," in Proceedings of the IRE, vol. 37, no. 1, pp. 10-21, Jan. 1949, doi: 10.1109/JRPROC.1949.232969.
Side note: The Nyquist-Shannon Sampling Theorem is commonly misinterpreted and misapplied. Often, microscopists say things such as "my microscope has a resolution of 200 nm, so I need to have a pixel size smaller than 100 nm". The problem is that Shannon, in "Communication in the Presence of Noise", refers to ideal periodic signals, where it makes sense to acquire equispaced samples just beyond the "Nyquist limit" (e.g. 100 nm in the example above). Equispaced samples only make sense in the context of periodic signals. Also, the presence of noise increases the need for additional samples beyond the "Nyquist sampling".
For bounded signals (e.g. images with a finite field of view), we need more samples located near the boundary of the signal than in its center. Practically, we also need to sample at least 3x the band-limit frequency, and I would argue we need about 6x the band limit for equispaced samples.
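As a back-of-the-envelope illustration of the difference (reusing the 200 nm example; the 3x and 6x factors are my suggestion above, not an established rule):

```python
# Optical resolution limit (smallest resolvable detail) of the microscope.
resolution_nm = 200.0

# Colloquial "Nyquist" rule: pixel size just below half the resolution.
colloquial_pixel_nm = resolution_nm / 2  # 100 nm

# Stricter suggestions above for equispaced samples of a bounded, noisy signal.
pixel_nm_3x = resolution_nm / 3          # ~67 nm
pixel_nm_6x = resolution_nm / 6          # ~33 nm

print(colloquial_pixel_nm, pixel_nm_3x, pixel_nm_6x)
```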
As discussed, I created a PR (#11), which will receive updates as this discussion progresses.
Thanks, @mkitti for that insightful comment! I agree that it seems we ran into the terrain/map issue. Part of the reason why is that we are still not sure about the scope of these definitions, I believe.
Ultimately, these definitions should provide an umbrella framework for the multiple different data formats we encounter in the wild. From this point of view, maybe a sufficient definition of image is, in fact, "an array with metadata about acquisition and/or visualization". This would allow us to clearly separate the two notions:
In this context, your comment would be a very good starting point for developing such a domain language.
Thanks @mkitti !
Regarding the definition of image, adding clarifications for discrete / digital and continuous images will be important.
It's not totally clear to me how to modify the definition based on your comment, but it sounds like you'd want image to mean only (?) an (idealized) continuous signal, suggesting that a sampling of it is "just" some re-presentation of it. My strong preference would be to be permissive in the definition, both because (these days) the common usage of the term refers to digital images (collections of samples), and because the usage in the standards documents here will most often refer to digital images. As well, a restrictive definition might seem to readers to imply that the discrete representation is somehow "second-class", and I'd like not to give readers that perception.
In my view, these definitions have two purposes:
We should discuss these.
Because these documents are primarily about storing images as files on computers, they will be about digital images. I expect most occurrences of "image" in the documents we write will refer to digital images on regular grids, and having the primary definition refer to exactly that will be good.
However, because of purpose (2), it will be valuable to have definitions and/or discussions relating to image reconstruction and how that relates to sampling + the Nyquist limit. I've been avoiding those terms...
It would be really great if you're willing to have a go at writing some definitions on those topics!
One last thing:
Both a digital array and film are representations of the image but are not the image itself.
I would have phrased this: "Both a digital array and film are images but not the physical object itself."
If I have one sampling of an image and I know the point spread function, I could then upsample or downsample the image without aliasing. Are the resampled arrays new images in and of themselves? Or are they just new representations of the same image?
A high resolution image will have more samples than a low resolution image for the same field of view.
This statement is problematic. A low resolution image could have as many samples as a high resolution image if it were oversampled.
A high resolution image requires more samples than a low resolution image in order to faithfully retain the information ("level of detail") within, but otherwise the relationship between resolution and the number of samples is not direct. I can increase the number of samples all that I want, but it will not actually increase the resolution of the image.
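Here's a minimal sketch of that point, assuming Fourier-domain (band-limited) resampling via scipy.signal.resample on a made-up 1D signal: upsampling adds samples, but no new frequency content appears beyond the original band limit.

```python
import numpy as np
from scipy.signal import resample

rng = np.random.default_rng(0)

# A band-limited 1D "image": white noise low-pass filtered in the Fourier domain.
n = 128
spectrum = np.fft.rfft(rng.standard_normal(n))
spectrum[20:] = 0.0                 # band limit at bin 20 (cycles per window)
signal = np.fft.irfft(spectrum, n)

# Upsample 4x via Fourier-domain (sinc) interpolation.
upsampled = resample(signal, 4 * n)

# The upsampled array has 4x as many samples, but its spectrum is still zero
# beyond the original band limit: no detail (resolution) has been gained.
up_spectrum = np.fft.rfft(upsampled)
print(len(signal), len(upsampled))       # 128 512
print(np.max(np.abs(up_spectrum[20:])))  # ~0 (numerical noise only)
```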
Overall, I think we are confusing technical concepts with their colloquialisms.
- The level of detail in an image. A high resolution image will have more samples than a low resolution image for the same field of view.
- The total number of samples in each dimension of an image. For example, a 2-dimensional image with Nx pixels along the x dimension and Ny pixels along the y dimension could be said to have a resolution of Nx × Ny (the colloquial convention is to express the dimensions in x, y, z order).
- The set of physical (usually spatial) sampling intervals for an image. In other words, the distance between samples. Usually expressed separately for each dimension, e.g. millimeters per pixel in x. (Note: The sampling interval is the reciprocal of the sampling rate.)
Definitions 2 and 3 above are colloquial uses of the term resolution, but only Definition 1 is the correct technical definition.
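For what it's worth, here is a small sketch separating the quantities in Definitions 2 and 3; the 44mm by 22mm field of view reuses the draft FOV example, and the pixel counts are made up:

```python
# Colloquial "resolution" as array shape (Definition 2): pixels along x and y.
shape_xy = (2048, 1024)

# Field of view of the same image, in millimeters.
fov_mm_xy = (44.0, 22.0)

# Colloquial "resolution" as sampling interval (Definition 3).
interval_mm_per_px = tuple(f / n for f, n in zip(fov_mm_xy, shape_xy))
print(interval_mm_per_px)  # (0.021484375, 0.021484375) mm per pixel

# The sampling rate is the reciprocal of the sampling interval.
rate_px_per_mm = tuple(1.0 / d for d in interval_mm_per_px)
print(rate_px_per_mm)      # ~46.5 pixels per mm in x and y

# Neither quantity says anything about Definition 1 (the level of detail):
# oversampling increases the pixel count and shrinks the interval without
# adding any detail to the underlying image.
```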
Because these documents are primarily about storing images as files on computers, they will be about digital images. I expect most occurrences of "image" in the documents we write will refer to digital images on regular grids, and having the primary definition refer to exactly that will be good.
A rastered sampling of an image on a regular grid is one of many ways to store a digital image as a file, but I would argue that most images are not stored that way. JPEG uses a DCT representation. PNG applies per-scanline filtering (optionally with Adam7 interlacing) followed by DEFLATE compression.
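A quick round-trip sketch of that point for PNG (using Pillow; the array is made up): the bytes written out are a filtered, DEFLATE-compressed representation rather than the raster itself, and decoding recovers the identical raster.

```python
import io

import numpy as np
from PIL import Image

# A small 8-bit grayscale raster.
raster = np.arange(64, dtype=np.uint8).reshape(8, 8) * 4

# Encode as PNG in memory: the stored bytes are not the raw raster.
buf = io.BytesIO()
Image.fromarray(raster).save(buf, format="PNG")
png_bytes = buf.getvalue()
print(len(png_bytes), raster.nbytes)    # encoded size vs. raw raster size

# Decoding recovers the identical raster (PNG is lossless).
decoded = np.asarray(Image.open(io.BytesIO(png_bytes)))
print(np.array_equal(decoded, raster))  # True
```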
We can discuss the colloquial meanings of the words, but I think we should be very clear what we mean since we are writing a technical document.
"resolution" is used in confusing ways -- colloquially, we use it to describe the sampling density of a measurement but as Mark notes, in signal processing terms a signal's resolution is determined before you sample it. You can acquire a trillion samples of a defocused optical image, but the result will still be defocused (i.e., low resolution).
I would suggest we avoid using the word "resolution" to describe the number of samples in an image / the spacing between samples. We have other ways to describe this property. As with the word "pixel", we should probably note that there is a colloquial use of the term, but also note that such usage is potentially confusing and should be avoided wherever possible.
As a practical consideration, how do we incorporate microscopy data that is not primarily described by an array of pixel values?
Data from Single Molecule Localization Microscopy (SMLM) is often given as a series of points. Examples of SMLM include PALM, STORM, PAINT, and MinFlux.
Combined with a radial basis function, these could be rastered as an array of pixel values, but that is not their primary representation. Does our definition of "image" require this kind of data to be rastered in order to become an "image"?
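To illustrate the rastering step, here's a minimal sketch that renders a list of localizations into a pixel array using a Gaussian as the radial basis function; the field of view, pixel size, localization precision, and points are all made up:

```python
import numpy as np

def raster_localizations(points_nm, fov_nm, pixel_nm, sigma_nm):
    """Render SMLM localizations (x, y in nm) into a 2D pixel array
    by placing a Gaussian (radial basis function) at each point."""
    nx = int(np.ceil(fov_nm[0] / pixel_nm))
    ny = int(np.ceil(fov_nm[1] / pixel_nm))
    yy, xx = np.mgrid[0:ny, 0:nx]
    xc = (xx + 0.5) * pixel_nm   # pixel-center coordinates in nm
    yc = (yy + 0.5) * pixel_nm
    image = np.zeros((ny, nx))
    for x, y in points_nm:
        image += np.exp(-((xc - x) ** 2 + (yc - y) ** 2) / (2 * sigma_nm ** 2))
    return image

# Three localizations in a 1 x 1 micron field of view,
# rendered at 10 nm pixels with 15 nm localization precision.
points = [(250.0, 300.0), (500.0, 500.0), (760.0, 420.0)]
img = raster_localizations(points, fov_nm=(1000.0, 1000.0), pixel_nm=10.0, sigma_nm=15.0)
print(img.shape)  # (100, 100)
```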
Another example could be some compressed sensing application where the phase and amplitude of particular spatial frequencies are measured using a microlens array. We see this kind of application already in FourierNets and certain kinds of holography.
My practical suggestion is to maintain the term "image" as an abstract notion of a general n-dimensional signal while qualifying its representations. For example, when we have an array representation we are often speaking of a "rastered image" or a "digital image".
While reading through the definitions collection, I came across some parts that need clarification. Since, after personal discussion with @bogovicj, it wasn't immediately clear to us how to resolve these issues, I'm opening them up for general discussion: