ivoa-std / VOUnits

Units in the VO
Creative Commons Attribution Share Alike 4.0 International

For discussion: how to annotate quantities with the semantics of "electrons per second" #35

Open gpdf opened 9 months ago

gpdf commented 9 months ago

It appears to be common to discuss low-level pixel data from infrared imaging sensors as having "engineering units" of "electrons per second". (Higher-level data products are often ultimately flux-calibrated.)

The SPHEREx instrument and science teams, with which I work, have specified this as the units for the Level 1 images from the mission. It's intended to be understood as a quantity having the dimensions of electric current. It's often measured, in infrared detectors, by a "sample up the ramp" technique that fits a slope to a series of measurements.
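A minimal sketch of the "sample up the ramp" idea, assuming an unweighted least-squares fit to noiseless, evenly spaced reads. The function name `ramp_slope` and the sample values are hypothetical. Real pipelines typically also weight the samples and reject cosmic-ray jumps, which this ignores.

```python
def ramp_slope(times_s, signal_e):
    """Ordinary least-squares slope of signal (electrons) against time (s)."""
    n = len(times_s)
    if n < 2 or n != len(signal_e):
        raise ValueError("need at least two matched samples")
    t_mean = sum(times_s) / n
    s_mean = sum(signal_e) / n
    # Covariance of (time, signal) divided by variance of time gives the slope.
    cov = sum((t - t_mean) * (s - s_mean) for t, s in zip(times_s, signal_e))
    var = sum((t - t_mean) ** 2 for t in times_s)
    if var == 0:
        raise ValueError("sample times must not all be equal")
    return cov / var


# Hypothetical non-destructive reads of one pixel:
# a 50-electron offset plus an accumulation rate of 12 electrons per second.
times = [0.0, 1.5, 3.0, 4.5, 6.0]
reads = [50.0 + 12.0 * t for t in times]
rate = ramp_slope(times, reads)  # slope, in electrons per second
```

The fitted slope is the per-pixel value that would be stored with the "electrons per second" unit; the offset drops out of the fit.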

I understand from @jwfraustro that the Roman Space Telescope project also wishes to produce data with "electrons per second" as the pixel data unit in images.

"In the field" we see projects using the string "e-/s" in FITS BUNIT and, therefore, also see this value appearing in ObsCore's o_unit attribute. Neither FITS 4.00 nor VOUnits provide for this specific syntax or any other explicit way to denote units with this meaning. That means that data with this string will fail validators and may need workarounds to suppress these validation errors.

Users would very much like data-discovery and image-display UIs to display something like "e-/s" on the screen. However, that doesn't necessarily mean that the machine-readable form of the units must be exactly that string.

It would be great if we could find a standards-compliant way to meet the projects' needs in this area.

Two possibilities not involving creating a newly allowed unit that have been suggested:

Baptiste has also pointed out "electron" is not per se a unit of measurement, so there's something quite wrong about adding it as a first-class element of the units model. Even though many of us might intuitively think of "electrons per second" as likely to be conceptually a current, "electron" by itself is not a unit of charge (or mass, or any other property) and couldn't legitimately be used as a unit in its own right. So it's not at all obvious how a string like "electron/s" would fit into the units model.

This is not a request for a specific change but rather for feedback / discussions / suggestions.

Interested parties: @bcrill @tatianag @jwfraustro

BaptisteCecconi commented 9 months ago

In space physics, we have the same use case, when we need to describe number density of electrons in space plasma. What we do is set the unit to s**-1 and the variable name (or label name) contains the information that this is an electron number density.

However, section 2.11 of the proposed recommendation allows unknown units if they are enclosed in single quotes. So a unit like 'electron'/s is valid.

nxg commented 9 months ago

I would agree with Baptiste, that ‘electron’ isn't a unit of measurement – it doesn't have M/L/T/etc dimensions (‘coulombs per second’ does, of course, but that's not what's being recorded). Thus the units of ‘electrons per second’ would indeed be s**-1, with an annotation somewhere, such as in an associated UCD, indicating what's being counted.

Dimensionless units, such as percent or radians, or scaling factors such as 1e3 or 1.898e27kg (Jupiter mass), are unexpectedly slippery to handle, and there was quite a lot of to and fro when discussing the VOUnits model, which resulted in the model remaining fairly unambitious. There is a % unit, but that was included only because it seemed perverse to exclude it, since it was so very widespread in practice. That of course conceptually opens the door to all sorts of scaling factors other than 1e-2, but no other such factors are happily countenanced.

As Baptiste also notes, the recommendation permits unknown units, so as a practical matter, electron/s would still parse (e-/s doesn't), so if you want to do that, no-one's going to stop you! Baptiste's 'electron'/s would be better, partly because it implicitly flags 'electron' as special in some sense, and also in principle, because the standard notes that, for unquoted unknown units, SI prefixes are still recognised, so (standard example) a furlong is a femto-urlong whereas a 'furlong' is a furlong.
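The furlong example can be sketched as follows. This is only an illustration of the prefix rule for unknown units, not the VOUnits grammar or any real parser: the helper `split_unknown_unit` and the prefix list are assumptions, and a real implementation would first check the symbol against the known units.

```python
# SI prefix symbols, with the two-letter "da" tried first (assumed list;
# binary prefixes and known-unit lookup are deliberately left out).
SI_PREFIXES = ["da", "Y", "Z", "E", "P", "T", "G", "M", "k", "h",
               "d", "c", "m", "u", "n", "p", "f", "a", "z", "y"]


def split_unknown_unit(symbol):
    """Return (prefix, base) for a symbol that is not a known unit.

    A quoted symbol is taken verbatim, with no prefix.
    An unquoted symbol has a leading SI prefix peeled off, if one matches.
    """
    if len(symbol) >= 2 and symbol[0] == "'" and symbol[-1] == "'":
        return (None, symbol[1:-1])
    for prefix in SI_PREFIXES:
        if symbol.startswith(prefix) and len(symbol) > len(prefix):
            return (prefix, symbol[len(prefix):])
    return (None, symbol)


unquoted = split_unknown_unit("furlong")    # femto + "urlong"
quoted = split_unknown_unit("'furlong'")    # plain "furlong"
electron = split_unknown_unit("electron")   # "e" is not an SI prefix
```

Under these assumptions the unquoted string splits into the prefix `f` and the base `urlong`, while the quoted form stays whole.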

jwfraustro commented 9 months ago

I had the chance to speak again this morning with Tyler Desjardins @tddesjardins, who originally brought the issue to my attention. I feared I might have missed some parts of the conversation when relaying it to @gpdf second-hand; it has been some years since my last class on CCD astronomy...

My conversation with Tyler re-summarizing our discussion:

…the context was that for HST and now Roman (so far...discussions are happening), we are storing the data in (photo)electrons / second. So those are instrumental units AFTER the gain conversion has been applied. This is useful because it is electrons that are related to the Poisson noise of the detectors, not DN (also called ADU). DN are what are reported out from the detectors, but that is a measurement of the free electrons in a pixel.

....I think the key is that these are "photoelectrons." Although I'll point out that astropy.units.astrophys module does contain a unit of "electron" alongside the units ADU and DN.
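A minimal sketch of the gain conversion described above, assuming a single constant gain and pure Poisson statistics (read noise and other terms are ignored). The function name and the numerical values are hypothetical.

```python
import math


def dn_to_electron_rate(dn, gain_e_per_dn, exposure_s):
    """Convert a DN (ADU) value to electrons per second.

    Returns (rate, sigma_rate), where sigma_rate is the Poisson
    uncertainty, computed from the electron count rather than from DN.
    """
    electrons = dn * gain_e_per_dn          # apply the gain conversion
    rate = electrons / exposure_s           # electrons per second
    sigma_electrons = math.sqrt(electrons)  # Poisson: variance equals the mean
    return rate, sigma_electrons / exposure_s


# Hypothetical pixel: 2000 DN at a gain of 2 electrons/DN over 100 s.
rate, sigma = dn_to_electron_rate(dn=2000.0, gain_e_per_dn=2.0, exposure_s=100.0)
```

Taking the square root of the DN value instead would give the wrong uncertainty whenever the gain differs from 1, which is the practical reason for storing electrons.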

mbtaylor commented 9 months ago

I note that the unit count is available in the VOUnits syntax, so using that is one possibility. But perhaps it is not sufficiently specific for your purposes.

tddesjardins commented 9 months ago

Hi Mark, someone (maybe Brian McLean?) had suggested count as well, and you're right we decided that wasn't really sufficient. Since we're using ASDF files for our science data, we can store the arrays as astropy quantity objects so they are self-describing in terms of units. But ASDF only accepts VO standard units (our devs have done some kind of workaround for the files, but it led to some other complications with catalogs and VO standards downstream), and we'd like to have them actually say they are in units of electrons / second.

gpdf commented 9 months ago

I think if HST and Roman and SPHEREx are all saying "we will have images in these units", even though there are acknowledged conceptual concerns, it would be very helpful if we could converge on a documented convention that eventually DS9, Firefly, Aladin, Ginga, Astropy, etc. could implement in order to display these units meaningfully.

I think that despite the formal concerns which we've discussed above, and which I acknowledge are real obstacles to accepting "electron" by itself as a unit of measurement in the standard, there's no substantive ambiguity about the meaning of the measurements these projects are reporting.

Since the other projects seem to have already decided that the "counting" solution doesn't meet their needs either, I now see the following possibilities:

gpdf commented 9 months ago

I want to add to the above: the point of finding a standards-based solution to this is to avoid pushing all these projects into turning off validators on the units that they use, and, as @tddesjardins says, having to do this through an entire toolchain, since the units of pixels in images eventually show up in catalogs, in o_unit in ObsCore, and so on.

mbtaylor commented 9 months ago

I'm not pushing this in either direction, but I note that VOUnits does have a unit "photon" which is somewhat conceptually similar to "electron" (except that while electron can be converted numerically to an SI unit, that doesn't obviously apply to photon).

nxg commented 9 months ago

The following are some fairly disjointed, partly historical, remarks. They do come to a conclusion, though.

I am indeed advising against scaling factors. The example that came up in the standardisation discussions was the unit of 'jupiterMass' vs the unit of 1.90e27kg. Should the latter be taken to be the same as 1.8982e27kg? What if you add more precision later? What if you re-measured the mass and got a different answer? How do you test for equality? Each of those questions has multiple plausible answers, but the mere possibility of asking them adds complication to the standard.
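The equality question can be made concrete with a small sketch. The tolerances below are arbitrary assumptions; the standard defines no such comparison.

```python
import math

# The two scale factors from the example, in kg.
JUPITER_MASS_ROUNDED = 1.90e27
JUPITER_MASS_PRECISE = 1.8982e27


def same_scale(a, b, rel_tol):
    """Treat two scale factors as 'the same unit' within a relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)


# The answer depends entirely on a tolerance that someone has to choose.
loose = same_scale(JUPITER_MASS_ROUNDED, JUPITER_MASS_PRECISE, rel_tol=1e-2)
strict = same_scale(JUPITER_MASS_ROUNDED, JUPITER_MASS_PRECISE, rel_tol=1e-4)
```

With a 1% tolerance the two factors compare equal; with a 0.01% tolerance they do not, so any answer would need a rule that a standard would then have to specify.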

As @mbtaylor points out, there is a photon unit, but that's one of the units (like %) which were grandfathered in simply on the basis of the number of pre-existing databases which were found to use them.

The units don't, and indeed can't, tell you everything about a quantity. For example, a pressure, if you choose to write it in base units, is kg.m**-1.s**-2; an energy density, in contrast, is kg.m**-1.s**-2. Both of those are slightly eccentric ways of writing the units, but not wrong. In each case, it's metadata somewhere else – perhaps attached to the table, or elsewhere in the message, or simply implicit – that tells you what it is legitimate to do with the number annotated with these units. UCDs stand ready to help with this orthogonal question, slightly raggedly, but practical and adaptable.
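A short sketch of the pressure and energy-density example, tracking only the exponents of the SI base units. The `combine` helper and the dictionary representation are assumptions made for illustration.

```python
def combine(*factors):
    """Multiply (dimensions, power) pairs; dimensions map base unit -> exponent."""
    total = {}
    for dims, power in factors:
        for base, exponent in dims.items():
            total[base] = total.get(base, 0) + exponent * power
    # Drop base units whose exponents cancelled out.
    return {base: exp for base, exp in total.items() if exp != 0}


METRE = {"m": 1}
FORCE = {"kg": 1, "m": 1, "s": -2}             # newton
ENERGY = combine((FORCE, 1), (METRE, 1))       # joule = N.m

pressure = combine((FORCE, 1), (METRE, -2))          # force per area
energy_density = combine((ENERGY, 1), (METRE, -3))   # energy per volume
```

Both results reduce to the exponents of kg.m**-1.s**-2, so the unit string alone cannot say which of the two quantities a number represents.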

It would be possible to talk about a more sophisticated object, a ‘quantity’, which bundles up this extra information, and people have. But that conversation rapidly gets out of control, in terms of what extra information?, how expressed?, controlled by which standard?, what do you actually want to do with these quantities? In the units discussion it seemed useful to keep well clear of those questions of semantics (and this is me saying this!), and keep the question focused as much as possible on ‘what unit strings are legal and what aren't?’ In doing so, VOUnits avoided revisiting what IVOA old-timers may remember as The Quantity Discussion – a summer of criss-crossing and occasionally bad-tempered 1000-word emails, which came to precisely no conclusions.

Another case where the VOUnits spec carefully avoids asking a question is that, while it is explicit that s means the second, it deliberately avoids making any commitments about which type of second it is. That's left to metadata, and context.

So for this case, I'd strongly suggest that the units be indicated as count/s, and that metadata elsewhere in your table/message/archive be relied upon to confirm just what is being counted, and how it should appear in a pretty axis label. That's a division of labour which goes with the grain of the VOUnits standard.

gpdf commented 9 months ago

@nxg I'm not arguing against other aspects of your point, but the analogy with the Jupiter mass is not really applicable.

The relationship between the electron charge and the ampere/coulomb is exact by construction: one SI coulomb is defined to be (5 * 10^27)/801088317 electron charges.

If we used a scale factor and amperes, there would be a precise spec for it based on this. If a data provider chose to round it off to fewer digits, it would certainly be up to, e.g., DS9, to choose whether or not to display it verbatim or to translate it to $e^{-}/s$ for display, but a) that wouldn't affect what was in the actual data file and b) realistically I think it's infinitesimally likely that a data provider providing a slightly different value would be deliberately attempting to convey a concept different from "electrons per second".
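The exact ratio quoted above can be checked with rational arithmetic, using the fixed SI value of the elementary charge, 1.602176634e-19 C. The helper name `electron_rate_to_amperes` is hypothetical.

```python
from fractions import Fraction

# Elementary charge in coulombs, exact by definition in the SI:
# 1.602176634e-19 written as a ratio of integers.
E_CHARGE_C = Fraction(1602176634, 10**28)

# Number of elementary charges in one coulomb.
electrons_per_coulomb = 1 / E_CHARGE_C


def electron_rate_to_amperes(rate_e_per_s):
    """Convert a rate in electrons per second to a current in amperes."""
    return rate_e_per_s * E_CHARGE_C


matches_quoted_ratio = electrons_per_coulomb == Fraction(5 * 10**27, 801088317)
one_ampere = electron_rate_to_amperes(electrons_per_coulomb)
```

Because the conversion is a fixed rational number, there is no measurement-dependent drift of the kind that affects a Jupiter-mass scale factor.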

On the other side of the scale, while all of the projects involved are not thrilled about the "count/s" alternative, it does map to the underlying Poisson process and the implied uncertainties associated with it.

nxg commented 9 months ago

@gpdf Indeed: the jupiterMass vs scaling factor issue doesn't really apply in this case. And in the other similar examples that emerged, there was always at least one interpretation that made sense and that made the ambiguity not a problem.

I mentioned it as illustration of the way that those involved in the VOUnits process realised (in my recollection) how close they were to a variety of messy auxiliary questions, and thus what the motivation was for keeping the scope of the standard as narrow as possible. The standard's goals are more modest than they may at first appear.

Picking up on your earlier point, I don't think anyone should feel pressure to turn off a validator:

% ./unity -ivounits -v electron/s
0 0/electron/1 0/SecondTime/-1
check: all units recognised?           no
check: all units recommended?          no
check: all units satisfy constraints?  yes

The string electron/s is a valid VOUnits string; it's just that the unit electron isn't given a gold star, as being one of the recommended standard units. The spec doesn't say ‘unrecognised’ == ‘bad’, just ‘unrecognised’ == ‘unrecognised, but you knew that, right?’, and ‘it'd be nice to stick to recognised units if you can’.

mbtaylor commented 9 months ago

For humans reading Unity validation output you might be right, but at least one way that validation is done is to feed all the units in a large set (VizieR has almost a million) to a validator and look at the ones that don't pass. In practice many might be syntactically valid unrecognised units that are unintentionally wrong, e.g. "sec" for seconds. If the validator behaves the same way for those as for syntactically valid unrecognised units that are used intentionally, e.g. "electron/s", then you'd either have to eyeball all the (many) "unrecognised" instances to identify the actual problems or just assume they're all as intended, which is not likely to be true.

Requiring rather than allowing quotation of unrecognised units would make distinguishing those cases much easier, but it would be a significant change to the VOUnits standard.
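A rough sketch of how bulk triage might look if quotation were required for intentional unknown units. This is a hypothetical rule, not current VOUnits behaviour: the `RECOGNISED` set is a tiny assumed subset, SI prefixes are ignored, and the splitting does not handle separators inside quotes.

```python
import re

# Tiny assumed subset of recognised unit symbols (prefixes not handled).
RECOGNISED = {"s", "m", "kg", "count", "photon"}


def unit_symbols(unit_string):
    """Split a unit string on '.' and '/', dropping '**' exponents."""
    symbols = []
    for factor in re.split(r"[./]", unit_string):
        symbol = factor.split("**")[0].strip()
        if symbol:
            symbols.append(symbol)
    return symbols


def is_quoted(symbol):
    return len(symbol) >= 2 and symbol[0] == "'" and symbol[-1] == "'"


def triage(unit_string):
    """Classify a unit string for bulk validation reports."""
    symbols = unit_symbols(unit_string)
    if any(s not in RECOGNISED and not is_quoted(s) for s in symbols):
        return "needs review"        # unquoted unknown: possibly a mistake
    if any(is_quoted(s) for s in symbols):
        return "declared unknown"    # quoted unknown: intentional
    return "recognised"


report = {u: triage(u) for u in ["count/s", "'electron'/s", "electron/s", "sec"]}
```

Under this assumed rule, only the unquoted `electron/s` and `sec` would land in the pile that a human has to inspect, while `'electron'/s` is separated out as deliberate.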