Closed markbandstra closed 7 years ago
Here is a script that uses pint for various types of quantities. Please give it a try.
I'm leaning toward not wanting to use pint, instead relying on clear conventions, clear variable names, and good testing.
A similar question, though, is whether we want to use uncertainties for uncertainties? It could save a lot of manual error propagation which has potential for bugs. The API doesn't have to rely on it, I think, but we could use uncertainties under the hood.
I am a huge fan of uncertainties. If you're just using gaussian error propagation it takes care of all that for you. If you're doing something fancier (e.g., asymmetric error bars) you would want to write your own stuff anyway.
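For reference, here is a minimal sketch of the manual Gaussian propagation that a package like uncertainties would take over. It uses only the standard library; the helper names (`subtract_with_sigma`, `divide_with_sigma`) and the numbers are made up for illustration and are not part of any existing code in this project. It assumes independent errors and Poisson counting statistics.

```python
import math


def subtract_with_sigma(a, sigma_a, b, sigma_b):
    """Return (a - b, sigma), assuming independent Gaussian errors."""
    # Absolute uncertainties add in quadrature for sums and differences.
    return a - b, math.hypot(sigma_a, sigma_b)


def divide_with_sigma(a, sigma_a, b, sigma_b):
    """Return (a / b, sigma), assuming independent Gaussian errors."""
    value = a / b
    # Relative uncertainties add in quadrature for products and quotients.
    relative = math.hypot(sigma_a / a, sigma_b / b)
    return value, abs(value) * relative


# Hypothetical example: net count rate from gross and background counts.
gross_counts = 1000.0
bkg_counts = 400.0
net, net_sigma = subtract_with_sigma(
    gross_counts, math.sqrt(gross_counts),  # Poisson: sigma = sqrt(N)
    bkg_counts, math.sqrt(bkg_counts),
)

livetime_s = 300.0
livetime_sigma_s = 0.0  # assumed to be known exactly
rate, rate_sigma = divide_with_sigma(net, net_sigma, livetime_s, livetime_sigma_s)

print(f"net = {net:.1f} +/- {net_sigma:.1f} counts")
print(f"rate = {rate:.3f} +/- {rate_sigma:.3f} counts/s")
```

Every new operation needs another hand-written rule like these, which is where the bugs tend to creep in.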
I did implement `pint` in an electron range module that had some conversions between energies, lengths, densities, and mass thicknesses, in order to try `pint` for myself. It is elegant in some ways, but other things I don't like (see below). So I still vote that we avoid using `pint`.
(Specifically: `to_base_units()`.
I have become convinced that `pint` is too much of a hassle for our purposes. Perhaps in the future it might have some use, but I agree that we should just use conventions and testing. I have removed the `pint` dependency from the `xcom` branch.
My proposal is:
(CGS? You're such an astrophysicist...) I'm fine with CGS. Most of the lengths we work with are small (on the order of the size of detectors), and if densities are involved then g/cm³ makes sense.
Until you suggest we use ergs for something. There I'll draw the line. ;-)
What about the variable naming convention? Do we still suffix energy variables with `_kev`, lengths with `_cm`, etc.?
Good question, I am worried that appending units to variable names could end up making things too verbose, but I do see the utility of doing that. What do others think?
One possible compromise would be to append unit names onto publicly accessible arguments and properties, and leave them off (or optional) on internal variables and attributes.
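A rough sketch of what that compromise could look like, assuming the CGS and keV conventions discussed above. The `Absorber` class and its attribute names are hypothetical and only illustrate the naming pattern: unit suffixes on public arguments and properties, bare names internally with the unit noted in a comment.

```python
class Absorber:
    """Hypothetical slab of material; public names carry unit suffixes."""

    def __init__(self, thickness_cm, density_g_cm3):
        # Internal attributes drop the suffix; units follow the project
        # convention (cm and g/cm^3) and are stated once, here.
        self._thickness = float(thickness_cm)  # cm
        self._density = float(density_g_cm3)   # g/cm^3

    @property
    def thickness_cm(self):
        """Thickness in cm."""
        return self._thickness

    @property
    def density_g_cm3(self):
        """Density in g/cm^3."""
        return self._density

    @property
    def mass_thickness_g_cm2(self):
        """Mass thickness (areal density) in g/cm^2."""
        return self._thickness * self._density


# Keyword arguments make the units visible at the call site.
slab = Absorber(thickness_cm=0.5, density_g_cm3=2.7)
print(slab.mass_thickness_g_cm2)
```

The call site stays self-documenting while the internals stay short, at the cost of relying on comments and tests to keep the internal units honest.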
Can we re-close this issue? I think we have decided to prioritize simplicity over using `pint`, through a combination of clear conventions, good documentation, and subscript hints.
I only hesitated because I wasn't very clear on when to name a variable with e.g. `_kev` and when not to. But I'm okay closing it and revisiting the variable naming once we have more code to have a feel for it.
Sounds good.
Following up on our discussion about using pint, I looked a bit more into its usability in our ecosystem. It seems very tightly coupled with numpy and uncertainties to the point where using it is nearly transparent, but its integration with pandas is poor. Quantities with both uncertainties and units can be stored in pandas data structures, but they need to be handled one-by-one instead of as an entire DataFrame or Series. (This is a known issue with pandas not supporting different methods of incorporating units.)
This could be a deal-breaker if we decide to rely on pandas heavily in this project.
I, for one, am not a big pandas user, so for me the cost-benefit of using pint weighs heavily toward benefit. For example, is this spectrum in units of counts, or counts per second, or counts per second per keV? Is this branching ratio a percentage or a dimensionless number? I have an activity in Becquerels; how do I do the conversion to mCi again?
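For comparison, this is roughly what the convention-only alternative to pint looks like for the activity and branching-ratio questions. It is a sketch using only the standard library; the function and constant names are hypothetical. The one physical fact used is the definition 1 Ci = 3.7e10 Bq.

```python
# 1 Ci is defined as exactly 3.7e10 Bq, so 1 mCi = 3.7e7 Bq.
BQ_PER_CI = 3.7e10
BQ_PER_MCI = BQ_PER_CI * 1e-3


def bq_to_mci(activity_bq):
    """Convert an activity from Bq to mCi."""
    return activity_bq / BQ_PER_MCI


def mci_to_bq(activity_mci):
    """Convert an activity from mCi to Bq."""
    return activity_mci * BQ_PER_MCI


def percent_to_fraction(branching_percent):
    """Convert a branching ratio given in percent to a dimensionless fraction."""
    return branching_percent / 100.0


source_activity_bq = 3.7e7
print(bq_to_mci(source_activity_bq))  # activity in mCi
print(percent_to_fraction(85.1))      # dimensionless branching ratio
```

Without a units package, each of these conversions has to be written, named, and tested by hand, which is the trade-off being weighed here.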