SasView / sasview

Code for the SasView application.
BSD 3-Clause "New" or "Revised" License

Split nxsunit into a separate package #2035

Open pkienzle opened 2 years ago

pkienzle commented 2 years ago

Options for nxsunit:

I'll implement the last option. The simulation code can set up the vectors in the target units directly.

If you are using sasmodels to fit measured data you presumably have the data loader on the path so the unit conversion code will be available. So long as nxsunit stays with the data loader the lazy loading will work.
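A minimal sketch of the lazy-loading idea, assuming a hypothetical backend module that exposes a `convert(values, from_units, to_units)` function. The names `lazy_converter`, `to_target_units` and `hypothetical_units_backend` are made up for illustration and are not the sasmodels or nxsunit API. The point is that the import only happens when a conversion is actually requested, so simulation code that already uses the target units never touches the data loader.

```python
import importlib


def lazy_converter(module_name, func_name):
    """Return a callable that imports module_name only on first use."""
    cache = {}

    def call(*args, **kwargs):
        if "func" not in cache:
            # The import is deferred until a conversion is really needed.
            module = importlib.import_module(module_name)
            cache["func"] = getattr(module, func_name)
        return cache["func"](*args, **kwargs)

    return call


def to_target_units(values, units, target_units, convert):
    """Convert values only when the units differ from the target units."""
    if units == target_units:
        # Simulation path: vectors are already in target units, no import.
        return values
    return convert(values, units, target_units)


# Hypothetical backend name; nothing is imported on this line.
convert = lazy_converter("hypothetical_units_backend", "convert")
q = to_target_units([0.01, 0.02], "1/A", "1/A", convert)
```

If the units differ and the backend is not on the path, the failure shows up as an `ImportError` at conversion time rather than when the model code is loaded.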

Splitting nxsunit into a separate package is worthwhile. Any generic NeXus file reader will need something like this to handle the idiosyncratic units that can appear. The code was originally written for the nexus api wrapper here. Another variant is in the NCNR reductus package here. Maybe combine it with other NeXus data read/write utilities.

Originally posted by @pkienzle in https://github.com/SasView/sasmodels/issues/465#issuecomment-1105468338

butlerpd commented 2 years ago

There is already a repo started for the dataloader. I would think nxsunit will be part of that? I'm not sure it makes sense to have too granular a set of packages. More importantly, it is not clear that we have the resources to maintain too many independent packages.

pkienzle commented 2 years ago

I created a new repo on github (scattering/nxsunit).

Test using: pip install git+https://github.com/scattering/nxsunit.git

I'll put it on pypi after it is reviewed.

lucas-wilkins commented 2 years ago

Regarding temperature conversion: there is a fundamental problem when attempting this. Nothing in the standard numerical representation says whether we're talking about differences in temperatures or absolute temperatures. For example, if I have a temperature difference of, say, 2degF, running the conversion would interpret this as a difference of about 256K.

My solution would be to have two conversions: convert_relative and convert_absolute. One could even be just called convert.
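To make the distinction concrete, here is a rough sketch of the two conversions. The function names come from the suggestion above, but the table and signatures are assumptions for illustration, not code from nxsunit. Each unit is stored as a scale (kelvin per unit) and an offset (kelvin at zero); absolute conversion uses both, relative conversion uses only the scale.

```python
# (scale, offset): kelvin = value * scale + offset
_TEMPERATURE = {
    "K": (1.0, 0.0),
    "C": (1.0, 273.15),
    "F": (5.0 / 9.0, 459.67 * 5.0 / 9.0),
}


def convert_absolute(value, from_units, to_units):
    """Convert a temperature reading, applying both scale and offset."""
    scale_in, offset_in = _TEMPERATURE[from_units]
    scale_out, offset_out = _TEMPERATURE[to_units]
    kelvin = value * scale_in + offset_in
    return (kelvin - offset_out) / scale_out


def convert_relative(delta, from_units, to_units):
    """Convert a temperature difference, applying the scale only."""
    scale_in, _ = _TEMPERATURE[from_units]
    scale_out, _ = _TEMPERATURE[to_units]
    return delta * scale_in / scale_out


print(convert_absolute(2, "F", "K"))  # a reading of 2 degF, about 256.5 K
print(convert_relative(2, "F", "K"))  # a change of 2 degF, about 1.1 K
```

The same number gives very different answers depending on which function is called, which is why the caller has to state the intent somewhere.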

lucas-wilkins commented 2 years ago

I have code that does dimensional analysis stuff, and partially implements a unit formatting system. Maybe in the future we could integrate it. Or, perhaps we migrate to an existing units package?

lucas-wilkins commented 2 years ago

There are other Unicode characters for degree symbols, or characters that would be used as them. You've got a couple of them, but there are more.

lucas-wilkins commented 2 years ago

Isn't there a nxunits database with all the unit information in it?

pkienzle commented 2 years ago

Regarding temperature conversion: there is a fundamental problem when attempting this. Nothing in the standard numerical representation says whether we're talking about differences in temperatures or absolute temperatures. For example, if I have a temperature difference of, say, 2degF, running the conversion would interpret this as a difference of about 256K.

My solution would be to have two conversions: convert_relative and convert_absolute. One could even be just called convert.

My method for handling ambiguous units is to specify the dimension to use during the conversion. In this case I could add a delta_temperature dimension to scale the change. For example:

dT = convert(10, 'F', 'C', dimension='delta_temperature')

pkienzle commented 2 years ago

I have code that does dimensional analysis stuff, and partially implements a unit formatting system. Maybe in the future we could integrate it. Or, perhaps we migrate to an existing units package?

This package is explicitly not doing dimensional analysis. It reads values from NeXus files, handling the somewhat obscure unit names that can appear therein. It could be used to load values into a dimensional analysis system, putting everything into standardized units. Alternatively, if you want to preserve the units in the file, then we could restructure the package into a two-step process: (1) translate unit names into standardized names and (2) convert to target units, with the dimensional analysis system using step (1). For now I'll leave it as is.
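For what it's worth, the two-step split could look roughly like the sketch below. This is not how nxsunit is structured today; the alias table, the `standardize` and `convert` names, and the handful of length units are all assumptions chosen to keep the example small.

```python
# Step 1 data: names that might appear in a file -> one standardized name.
# Illustrative aliases only, not the list used by nxsunit.
_ALIASES = {
    "angstrom": "angstrom",
    "Angstroms": "angstrom",
    "Å": "angstrom",
    "nanometer": "nm",
    "nanometers": "nm",
    "nm": "nm",
    "metre": "m",
    "meters": "m",
    "m": "m",
}

# Step 2 data: scale of each standardized unit relative to the meter.
_METERS_PER_UNIT = {"angstrom": 1e-10, "nm": 1e-9, "m": 1.0}


def standardize(name):
    """Step 1: translate a unit name from a file into a standardized name."""
    try:
        return _ALIASES[name.strip()]
    except KeyError:
        raise ValueError(f"unknown unit name: {name!r}") from None


def convert(value, from_units, to_units):
    """Step 2: convert a value between units given by any known alias."""
    scale_in = _METERS_PER_UNIT[standardize(from_units)]
    scale_out = _METERS_PER_UNIT[standardize(to_units)]
    return value * scale_in / scale_out


print(standardize("Angstroms"))         # standardized name only, value untouched
print(convert(10.0, "Angstroms", "nm"))
```

A dimensional analysis system would call only `standardize` and keep the file's units, while a loader that wants everything in target units would call `convert`.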

pkienzle commented 2 years ago

There are other Unicode characters for degree symbols, or characters that would be used as them. You've got a couple of them, but there are more.

Wikipedia only mentions the three that I used. If you know of NeXus files that use other symbols please report it as an issue.
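As an illustration of how more symbols could be folded in if they are reported, here is a small sketch that maps degree-like characters onto a single spelling before lookup. The three code points listed are an assumption for the example (the thread does not say which three are handled), and `normalize_degrees` is a hypothetical helper, not part of nxsunit.

```python
# Degree-like characters, assumed for illustration:
# U+00B0 DEGREE SIGN, U+00BA MASCULINE ORDINAL INDICATOR, U+02DA RING ABOVE
_DEGREE_LIKE = "\u00b0\u00ba\u02da"

# Translation table replacing each degree-like character with "deg".
_DEGREE_TABLE = str.maketrans({ch: "deg" for ch in _DEGREE_LIKE})


def normalize_degrees(unit_name):
    """Replace any degree-like character in a unit name with 'deg'."""
    return unit_name.translate(_DEGREE_TABLE)


print(normalize_degrees("\u00b0C"))
print(normalize_degrees("\u02daF"))
```

Supporting another symbol is then a one-character change to `_DEGREE_LIKE`.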

pkienzle commented 2 years ago

Isn't there a nxunits database with all the unit information in it?

Here is what the NeXus standard says about units:

Given the plethora of possible applications of NeXus, it is difficult to define units to use. Therefore, the general rule is that you are free to store data in any unit you find fit. However, any data field must have a units attribute which describes the units. Wherever possible, SI units are preferred. NeXus units are written as a string attribute (NX_CHAR) and describe the engineering units. The string should be appropriate for the value. Values for the NeXus units must be specified in a format compatible with Unidata UDunits. Application definitions may specify units to be used for fields using an enumeration.

Looking at UDunits there are some that I've seen in NeXus files that are not in UDunits and many that are in UDunits that are not in NeXus files. Units in NeXus files that are not supported by nxsunit should be reported as a bug.

Or maybe you are noting that I'm constructing the database on the fly in Python rather than having some sort of declarative language to represent the units in a single table on disk.

lucas-wilkins commented 2 years ago

I saw a units package (which I thought was called NX something, but it seems I'm probably mistaken) that included what appeared to be a pretty comprehensive list of units and their relations in XML format.

I don't see a problem with having them all hard coded, just that there's a lot to code if you want to make it comprehensive.

lucas-wilkins commented 2 years ago

In one of the issues I was dealing with, I wanted to have units that were derived from others. This doesn't work without a pretty deep understanding of how units work. I'm definitely not saying it needs to be done, but it's something I've been thinking about for a while.

lucas-wilkins commented 2 years ago

OK, sounds good
