pont-us / PuffinPlot

A program to plot and analyse palaeomagnetic data
GNU General Public License v3.0

Switch to integer or fixed-point depth representation #376

Open pont-us opened 3 years ago

pont-us commented 3 years ago

The current depth implementation is a bit of a mess: Datum returns depth as a String (!) and Sample returns it as a double. An internal integer representation would work better (it would allow e.g. selecting by a specified depth without worrying about mismatches due to floating-point precision), presented as a fixed-point value (metres to 2 d.p. being the most common format).

One potential worry: depth may not always be straight out of the magnetometer (e.g. might have been calculated and included in the file before it's loaded into PuffinPlot) and may in that case be arbitrary precision. How to accommodate this while preserving nice behaviour for the common case?

On a quick first look, BigDecimal looks good for this (but see http://www.stichlberger.com/software/java-bigdecimal-gotchas/ ). Can just instantiate straight from a String in an input file. We don't do any maths with the depth so performance shouldn't be a problem, but it still seems like overkill. Fixed precision (e.g. decimal4j) may be more appropriate.
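A minimal sketch of the BigDecimal route, assuming the depth arrives as a plain decimal string. The class and method names (`DepthParseSketch`, `parseDepth`, `sameDepth`) are hypothetical and not part of PuffinPlot. It also shows one of the better-known gotchas: `equals` takes the scale into account, so comparisons between depths should go through `compareTo`.

```java
import java.math.BigDecimal;

final class DepthParseSketch {

    /** Parses a depth field exactly as written; "1.10" keeps its scale of 2. */
    static BigDecimal parseDepth(String field) {
        // The String constructor is exact, unlike new BigDecimal(double).
        // It does not accept surrounding whitespace, hence the trim().
        return new BigDecimal(field.trim());
    }

    /** Scale-insensitive comparison: 1.1 and 1.10 count as the same depth. */
    static boolean sameDepth(BigDecimal a, BigDecimal b) {
        return a.compareTo(b) == 0;
    }

    public static void main(String[] args) {
        BigDecimal a = parseDepth("1.1");
        BigDecimal b = parseDepth("1.10");
        System.out.println(a.equals(b));     // false: the scales differ
        System.out.println(sameDepth(a, b)); // true
        System.out.println(b);               // 1.10
    }
}
```

The same scale sensitivity applies to `hashCode`, so raw BigDecimal values are risky as keys in a HashMap or HashSet when selecting by depth.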

One possibility: multiply depths by 1e9 on loading, store in a long (max. value ≈ 9.2e18), and divide by 1e9 for display. That gives a range of >1e9 depth units to a precision of 9 decimal places (1e-9 depth units), far more than adequate for any conceivable real-world situation. This representation should be restricted to in-memory use, though. Even within PuffinPlot's own file format, I would prefer to see "1.01" rather than "1010000000", which makes the file format more self-explanatory. And it feels like a better solution than "multiply input by enough powers of 10 to cover the d.p., and store the conversion factor as a suite parameter", which has more potential for complications and bugs.
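The scaled-long idea could look roughly like the sketch below. All names (`ScaledDepthSketch`, `toScaled`, `toDisplay`) are hypothetical. As an assumption of this sketch, the conversion goes through BigDecimal rather than double arithmetic so that the multiplication by 1e9 is exact, and input with more than 9 decimal places is rejected rather than rounded.

```java
import java.math.BigDecimal;

final class ScaledDepthSketch {

    /** Number of decimal places held in the long (i.e. a factor of 1e9). */
    private static final int PLACES = 9;

    /**
     * Converts a decimal string to a long count of 1e-9 depth units.
     * Throws ArithmeticException if the input has more than 9 decimal
     * places or does not fit in a long.
     */
    static long toScaled(String field) {
        return new BigDecimal(field.trim())
                .movePointRight(PLACES)
                .longValueExact();
    }

    /** Converts back to a plain decimal string, dropping trailing zeros. */
    static String toDisplay(long scaled) {
        return BigDecimal.valueOf(scaled, PLACES)
                .stripTrailingZeros()
                .toPlainString();
    }

    public static void main(String[] args) {
        long scaled = toScaled("1.01");
        System.out.println(scaled);            // 1010000000
        System.out.println(toDisplay(scaled)); // 1.01
    }
}
```

Note that the round trip does not recover the original formatting: "1.10" comes back as "1.1", because the number of decimal places in the input is not stored.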

On reflection: I feel that whatever I do, the original depth string should be preserved and used for loading and saving, with the numerical representation just acting as a view at runtime. That's the easiest way to guarantee no loss of precision. Starting from that POV, BigDecimal seems the most sensible option.
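One way this could be structured is sketched below, assuming Java 8 or later. The class `DepthSketch` and its methods are hypothetical, not existing PuffinPlot code. The original string is kept verbatim for saving, and the BigDecimal is used only for comparison and selection at runtime.

```java
import java.math.BigDecimal;

final class DepthSketch implements Comparable<DepthSketch> {

    private final String original;  // exactly as read from the input file
    private final BigDecimal value; // numerical view, used at runtime only

    DepthSketch(String original) {
        this.original = original;
        this.value = new BigDecimal(original.trim());
    }

    /** The string to write back out when saving: the untouched input. */
    String toFileString() {
        return original;
    }

    BigDecimal value() {
        return value;
    }

    @Override
    public int compareTo(DepthSketch other) {
        return value.compareTo(other.value);
    }

    /** Numerical equality: "1.1" and "1.10" are the same depth. */
    @Override
    public boolean equals(Object o) {
        return o instanceof DepthSketch
                && value.compareTo(((DepthSketch) o).value) == 0;
    }

    /** Normalises the scale so that equal depths hash equally. */
    @Override
    public int hashCode() {
        return value.stripTrailingZeros().hashCode();
    }
}
```

With `equals` and `hashCode` defined on the numerical value, selecting by a specified depth reduces to an ordinary lookup, while `toFileString()` guarantees that saving writes back exactly what was loaded.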