CIRDLES / Squid

Squid3 is being developed by the Cyber Infrastructure Research and Development Lab for the Earth Sciences (CIRDLES.org) at the College of Charleston, Charleston, SC, and Geoscience Australia as a re-implementation in Java of Ken Ludwig's Squid 2.5. Please contribute your expertise!
http://cirdles.org/projects/squid/
Apache License 2.0

Provide Significant Digit specifier for reports #460

Open bowring opened 4 years ago

bowring commented 4 years ago

Provide a setting for the number of significant digits in the 1-sigma absolute uncertainty, and use it to drive the presentation of data in the reports, per ET_Redux functionality. Discussed by @bowring and @NicoleRayner.

bowring commented 3 years ago

Here is more proposed detail per a discussion I had with @auscopegeochemistry in issue #589 and moved here:

An improvement in the works is to provide simple formatting driven by:

1) a default significant digits count (currently 15), with specific overrides for individual fields

2) a count of significant digits in the 1-sigma absolute uncertainty, as pioneered in ET_Redux. Thus for the case of 12345.678 with 1-sigma abs of 23.45678, the user could specify 3 significant digits of uncertainty and the report would show 12345.7 with 1-sigma abs of 23.5; four digits would show 12345.68 with 1-sigma abs of 23.46.
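To make option 2 concrete, here is a minimal Java sketch of uncertainty-driven formatting. It is only an illustration under stated assumptions: the class `SigDigitFormatter` and its `format` method are hypothetical names, not existing Squid3 or ET_Redux code, and it assumes conventional half-up rounding for both the value and the uncertainty (so 23.45678 at three significant digits becomes 23.5).

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper for illustration only; not part of Squid3 or ET_Redux.
final class SigDigitFormatter {

    /**
     * Formats a value and its 1-sigma absolute uncertainty so that the
     * uncertainty shows sigDigits significant digits and the value is
     * rounded to the same decimal place.
     * Returns {formattedValue, formattedUncertainty}.
     * Caveat: if rounding carries into a new leading digit (e.g. 9.96 -> 10.0),
     * the result shows one more significant digit than requested.
     */
    static String[] format(double value, double oneSigmaAbs, int sigDigits) {
        BigDecimal unct = BigDecimal.valueOf(Math.abs(oneSigmaAbs));
        // Position of the leading digit relative to the decimal point:
        // 23.45678 -> 2, 0.0234 -> -1.
        int integerDigits = unct.precision() - unct.scale();
        // Decimal places needed to show sigDigits digits of the uncertainty.
        int decimalPlaces = sigDigits - integerDigits;

        BigDecimal roundedUnct = unct.setScale(decimalPlaces, RoundingMode.HALF_UP);
        BigDecimal roundedValue =
                BigDecimal.valueOf(value).setScale(decimalPlaces, RoundingMode.HALF_UP);
        return new String[] {roundedValue.toPlainString(), roundedUnct.toPlainString()};
    }

    public static void main(String[] args) {
        String[] three = format(12345.678, 23.45678, 3);
        System.out.println(three[0] + " +/- " + three[1]); // 12345.7 +/- 23.5
        String[] four = format(12345.678, 23.45678, 4);
        System.out.println(four[0] + " +/- " + four[1]);   // 12345.68 +/- 23.46
    }
}
```

Using `BigDecimal` rather than `String.format` keeps the rounding mode explicit, so a different policy could be swapped in if the project prefers one.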

Please chime in!

sbodorkos commented 3 years ago

@bowring I think it depends on the context for the 'simple formatting': are we talking about values displayed on-screen inside the Squid3 application, or values written to CSV reports? Same question for 'the report would show'... I am comfortable with the idea inside Squid3, where we are talking about formatting for display.

I am less comfortable with the idea applied to CSV output, because it will enable users (especially inexperienced ones) to over-round their data, and once that's done (and the CSV is detached from the Squid3 Project), it can't be undone. The literature is already awash with over-rounded data (e.g. data-tables that do not have enough significant digits/decimal places to permit verification of a weighted mean that the author asserts they have calculated), and it is the bane of my existence. Comparatively, under-rounding is a much lesser crime, and for as long as you have "too many" significant digits, you have the option of altering the display of your data to portray fewer. Obviously the reverse is not true.

Related to this, I think we need to be careful about presuming too far about the range of purposes a diverse group of end-users might find for Squid3 geochronology data delivered as a webservice. There might well be an optimum number of significant digits for 'traditional' geological users, but if your goal is to reconstruct published calculations (for example), every digit matters, no matter how 'insignificant'.

Excel-based SQUID and Isoplot never deliver numeric data at anything less than full (Microsoft) double precision, although the display of that data is invariably streamlined. One drawback of our CSV-based reports is that we are not able to distinguish between the display of a numeric value and its actual value.
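The difference between a value and its display can be sketched in a few lines of Java. This is a hypothetical illustration (the class `CsvPrecisionDemo` and its methods are invented for this example, not Squid3 code): a double written with `Double.toString` can be parsed back to exactly the same double, whereas a value rounded before it is written to text cannot be recovered.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical demo for illustration only; not part of Squid3.
final class CsvPrecisionDemo {

    /** Full-precision text form: parsing it back yields the same double. */
    static String fullPrecision(double v) {
        return Double.toString(v);
    }

    /** Display-style text form rounded to a fixed number of decimal places. */
    static String rounded(double v, int decimalPlaces) {
        return BigDecimal.valueOf(v)
                .setScale(decimalPlaces, RoundingMode.HALF_UP)
                .toPlainString();
    }

    public static void main(String[] args) {
        double original = 12345.678;

        double fromFull = Double.parseDouble(fullPrecision(original));
        double fromRounded = Double.parseDouble(rounded(original, 1));

        System.out.println(fromFull == original);    // true: nothing was lost
        System.out.println(fromRounded == original); // false: 12345.7 != 12345.678
    }
}
```

A CSV cell holds only one of these two strings, which is why rounding at export time is irreversible once the file is separated from the project.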

None of this means you shouldn't go ahead with it, but it would be best if it was an opt-in feature (rather than an opt-out), because it will be dangerous in inexperienced hands.

auscopegeochemistry commented 3 years ago

Hi Jim and Simon (@bowring @sbodorkos), Thanks for including me in the discussion, and thanks for your detailed explanations of my earlier queries! As a user, I do not mind seeing a lot of significant digits in the preview / peek window. (It would be great to be able to scale the interface font; if the peek window then became filled with digits, some 'insignificant' digits could be dropped from view.) I do agree with Simon that the CSV, or any output data file, should contain all the information possible. Best, Alex