CIRDLES / Squid

Squid3 is being developed by the Cyber Infrastructure Research and Development Lab for the Earth Sciences (CIRDLES.org) at the College of Charleston, Charleston, SC, and Geoscience Australia, as a re-implementation in Java of Ken Ludwig's Squid 2.5. Please contribute your expertise!
http://cirdles.org/projects/squid/
Apache License 2.0

Errors for element concentrations #561

Closed NickRodionov closed 3 years ago

NickRodionov commented 3 years ago

Squid3 (as well as Squid 2.5) does not report errors on U, Th, Pb_rad, or any custom element (e.g. Hf) concentrations for the U-Pb routine analysis. Indeed, natural inhomogeneities of RMs are much greater than the errors measured in unknowns, yet I need to indicate these in my report table!

sbodorkos commented 3 years ago

Hello @NickRodionov and welcome!

I think the main reason Ken Ludwig omitted measurement uncertainties from elemental concentrations in SQUID 2.50 (and Squid3) is as you suggest, and that he did not want users to think that the high precision of individual elemental spot-measurements meant that those individual measurements were accurate, or repeatable/reproducible. I think he envisaged elemental concentrations being used primarily as a relative indication of abundance, within a single session in which all elemental concentrations had been derived from the same set of concentration RM measurements. Useful as a parameter by which data-rows might be sorted, for example.

But it is still an interesting question, because people might wish to quantify the relative contributions of spot-uncertainty vs. the 'spot-to-spot' error required to account for the full range of observed variation in elemental concentrations. So I am going to share what I have learnt about how Ludwig coded this in SQUID 2.50 (replicated in Squid3), using U (ppm) as an example. I have made use of User-defined equations in SQUID 2.50 to make the relevant calculations (which Ludwig hid from users) more transparent, in the hope that this will help users customise the equations/expressions for their own use (in SQUID 2.50 or Squid3), such as generating spot-uncertainties for Hf concentrations, for example.

My Task is named "Zircon 09pk exp=2 Rodionov" and it is attached in this ZIP, along with the SQUID 2.50 workbook I made with it (using my own Prawn XML file), and a 'frozen' version of the workbook in case it is useful and/or something goes wrong with the 'live' workbook:

Rodionov_Uppm_Task+SQ2.50books.zip

The Task resembles the classical U-Pb data-reduction routine developed by ANU as far as possible, although users should note that the attached version actually has 10 peaks (including mass 270), in order to match the only Prawn XML file I have in hand today. For 9-peak users, mass 270 can be deleted from the Task without affecting its function. Here are the Special U-Th-Pb expressions, following ANU:

Special_UThPb_Eqns

And here are the Reduce Data settings I used. Note that this exercise uses Harvard reference zircon 91500 as both the Isotopic RM and the Concentration RM, because that was the smoothest way to formulate the user-defined expressions. There is no reason why my expressions could not be adapted to use a Concentration RM that differs from the Isotopic RM; there would just be a few extra expressions involved:

ReduceData_Settings

Conceptually, the key is to think in terms of "concentration constants", which are evaluated the same way as the "calibration constants" we are all much more familiar with. Both are evaluated using NU-enabled expressions, so both have a %err (at spot-level) generated by "numeric perturbation" of the mass-peaks involved in the expression, using Ludwig's built-in algorithms. I have exposed some of the detail of the "concentration constant" calculation, by replicating the "Special U-Th-Pb expression for uranium concentration" as a User-defined expression ("U_Concen|Constant") in my Task (user-defined expression 7 below; in all cases, the vertical bar is used only to denote a position where a line-break is desired, for labelling purposes):

UserDefined_Eqns
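As an aside, the principle of "numeric perturbation" can be sketched in a few lines of Python: perturb each input mass-peak by its own 1sigma uncertainty, re-evaluate the expression, and combine the resulting changes in quadrature. This is only an illustration of the idea, not Ludwig's actual VBA implementation; the `ratio` expression and all numbers below are invented.

```python
import math

def perturbed_percent_err(f, peaks, sigmas):
    """Approximate the 1-sigma %err of f(peaks) by perturbing each
    mass-peak by its own uncertainty and summing the resulting
    changes in quadrature (a finite-difference delta method)."""
    base = f(peaks)
    var = 0.0
    for i, sigma in enumerate(sigmas):
        bumped = list(peaks)
        bumped[i] += sigma
        var += (f(bumped) - base) ** 2
    return 100.0 * math.sqrt(var) / abs(base)

# Toy NU-style expression: a simple ratio of two mass-peak count totals,
# with Poisson (sqrt-of-counts) uncertainties assumed
ratio = lambda p: p[0] / p[1]
err = perturbed_percent_err(ratio, [1000.0, 50.0],
                            [math.sqrt(1000.0), math.sqrt(50.0)])
```

For small relative uncertainties this approaches the familiar first-order quadrature of the relative errors; the attraction of the perturbation approach is that it avoids deriving partial derivatives by hand for more complicated NU-switched expressions.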

There are four important user-defined expressions in this set:

The first is expression 7, which has ST and SA both switched off, so it is evaluated for every spot on each of the StandardData and SampleData sheets. An intermediate step in the calculations can be observed on the "Within-Spot Ratios" worksheet of the SQUID-book, which contains the Dodson-interpolated inter-scan values (and times) for all the isotopic ratios, as well as NU-switched expressions. Note that the inter-scan values for my expression "U_Concen|Constant" are identical to those obtained from the expression "ppmU" derived from Ludwig's Special U-Th-Pb expressions:

Within-Spot_Ratios

However, when we move to the StandardData and SampleData sheets, our "U_Concen|Constant" values and their %errs appear in full (StandardData!AV:AW, and SampleData!AJ:AK), whereas the identical "ppmU" data calculated by Ludwig's built-in expressions does not appear at all. Our new columns are analogous to the calibration-constant columns labelled either "Uncorr|Pb/U|const" (StandardData!BY:BZ) or "206Pb/|238U|calibr.|const" (SampleData!AN:AO): they contain the same type of data, but for a different, concentration-related expression.

The second is expression 9, which embodies what SQUID 2.50 and Squid3 actually do with respect to calculating a "U concentration constant" from measurements of the Concentration RM: a simple arithmetic mean of the StandardData U_Concen|Constant values, which ignores their %errs completely. This can be confirmed by comparing the Task-generated "U_Norm|Factor" (StandardData!AZ7) with the ConcStdConst value (labelled "Std mean U const.") at StandardData!CR13.

The third is expression 12, which generates the U (ppm) values for each and every spot, via the equation:

UCC[spot] / UNormFactor = HandUppm[spot] / Uppm[RM]

expressed as:

HandUppm[spot] = Uppm[RM] * UCC[spot] / UNormFactor

where Uppm[RM] in this case is 81 ppm (Wiedenbeck et al., 1995), UCC[spot] is the spot-specific output of global expression 7, and UNormFactor is the single-value output of RM-based expression 9. Our hand-calculated output (StandardData!BC, and SampleData!AL) compares extremely well to the values calculated from Ludwig's built-in algorithm (StandardData!BN, and SampleData!BA):

HandUppm_vs_AutomaticUppm

I have expanded the full range of significant figures on StandardData for illustration. (Some values show minor differences at the final significant figure, and this is a good example of the numerical "noise" I talked about at the Zoom workshops. Column BC was calculated using Task equations in a front-end Excel spreadsheet, whereas column BN was evaluated by the SQUID 2.50 code in the back-end VBA 6 environment, and the answers are not always identical.)
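For readers who prefer code to spreadsheet columns, the logic of expressions 9, 12 and 13 can be sketched in Python. The UCC spot values below are invented placeholders; the only number carried over from the text is the 81 ppm reference value for 91500:

```python
# Hypothetical "U concentration constant" (UCC) values for spots on the
# concentration RM; illustrative numbers only
rm_ucc = [0.0102, 0.0095, 0.0110, 0.0098]

# Expression 9: a simple arithmetic mean of the RM spot values,
# ignoring their %errs completely
u_norm_factor = sum(rm_ucc) / len(rm_ucc)

U_PPM_RM = 81.0  # 91500 reference U concentration (Wiedenbeck et al., 1995)

def hand_uppm(ucc_spot):
    """Expression 12: scale a spot's concentration constant to ppm."""
    return U_PPM_RM * ucc_spot / u_norm_factor

def abs_err_ppm(ucc_spot, pct_err):
    """Expression 13: convert a spot's %err to a 1-sigma absolute error (ppm)."""
    return hand_uppm(ucc_spot) * pct_err / 100.0
```

By construction, a spot whose UCC equals the RM mean returns exactly 81 ppm.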

The fourth is expression 13, which simply converts the %errs associated with U_Concen|Constant, to absolute uncertainties (1sigma, ppm) associated with Hand|Uppm. I think these are the numbers @NickRodionov wanted.

The remaining expressions (8, 10 and 11) were included to assess the relative contributions of (a) the measurement uncertainties and (b) the spot-to-spot error, to the total observed dispersion of spot U (ppm) values in the Concentration RM. Note that the measured %err values on U_Concen|Const for Harvard 91500 are of the order of 1–2% (1sigma).

The first approach is covered by expression 8, which calculates an inverse-variance weighted mean of the U_Concen|Constant values, with output uncertainties expressed as percentages, and auto-rejection of spots forbidden. Ludwig's WtdAv (and sqWtdAv) functions include a maximum-likelihood estimator, the most visible expression of which is the estimation of the most likely constant spot-to-spot error (1sigma, percent) needed to yield MSWD ~ 1 for the population, taking into account the measurement uncertainties.

It is obvious to the naked eye that the data are dispersed far beyond the measurement uncertainties, and sqWtdAv quantifies that: each of the measured 1sigma spot uncertainties (which are of the order of 1–2%) must be supplemented (in quadrature) by a constant spot-to-spot error of 14.5% (1sigma) before the MSWD of the population will be 1.
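The effect of the spot-to-spot error can be mimicked with a short Python sketch: add a constant external %err in quadrature to each spot's measured %err, and search (here by simple bisection) for the value that brings MSWD down to ~1. This imitates the outcome of Ludwig's maximum-likelihood estimator rather than reproducing his algorithm, and the data are invented:

```python
import math

def weighted_mean_mswd(values, abs_errs):
    """Inverse-variance weighted mean and MSWD."""
    weights = [1.0 / e ** 2 for e in abs_errs]
    mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    chi_sq = sum(((v - mean) / e) ** 2 for v, e in zip(values, abs_errs))
    return mean, chi_sq / (len(values) - 1)

def spot_to_spot_pct(values, pct_errs, hi=100.0, iters=60):
    """Constant external error (1-sigma, %) that, added in quadrature
    to each spot's %err, yields MSWD ~ 1."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        abs_errs = [v * math.hypot(p, mid) / 100.0
                    for v, p in zip(values, pct_errs)]
        _, mswd = weighted_mean_mswd(values, abs_errs)
        # MSWD still too high -> need a larger external error
        lo, hi = (mid, hi) if mswd > 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

# Invented UCC-like values dispersed well beyond their 1.5% spot errors
values = [70.0, 95.0, 81.0, 60.0, 100.0, 88.0, 74.0, 92.0]
external = spot_to_spot_pct(values, [1.5] * len(values))
```

When the measurement errors are this small relative to the scatter, the result lands close to the simple relative standard deviation of the values, which is the point of the second approach discussed next.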

The second approach spans expressions 10 and 11 (because it seems necessary to do these calculations as separate steps in SQUID 2.50). Expression 10 evaluates the standard deviation of the U_Concen|Constant values for 91500 as an absolute value, and expression 11 converts it to a percentage.

The classical standard deviation can be thought of as an "end-member" type of Ludwig's maximum-likelihood estimator, in which the measurement uncertainties are completely disregarded (because their magnitudes are considered totally insignificant, relative to the spot-to-spot error, which the standard deviation approximates), and the value of 15.1% obtained from expression 11 compares well to the value of 14.5% returned by expression 8. This indicates that the approximation obtained by simply applying StDev directly is a good one, in this case (and for U (ppm) in general, I suspect).

Finally, I would say that we can use these data to make guesses about the real sources of spot-uncertainties in U (ppm), and their magnitudes. Expressions 8, 10 and 11 indicate that the true spot-errors on U concentration in this 91500 dataset are ~15%. What are the possible components?

  1. Counting statistics: We already know that measurement uncertainties make up 1–2% of that.
  2. SHRIMP-related reproducibility: Here, we can look at the 206/238 calibration constants for clues regarding the SHRIMP-specific component of repeatability. If we assume that 91500 is truly uniform in 206/238, and we note that the spot-to-spot error of 0.97% (StandardData!BF26) matches well with the individual spot-errors of the 204-corrected calibration constants (StandardData!BF), it seems reasonable to assume that a similar relationship might hold for U (ppm), if 91500 was truly uniform in U (ppm). In that hypothetical scenario, a further 1–2% of the U (ppm) uncertainty (i.e. matching the measurement uncertainties) would come specifically from SHRIMP's inability to reproduce measurement values from spot to spot, even in perfectly homogeneous materials (e.g. Stern & Amelin, 2003).
  3. True heterogeneity of RM: The two lots of 1–2% above still leave us well short of the 'true' U (ppm) spot-errors of ~15%, and it seems likely that the remaining 10+% has nothing to do with the SHRIMP at all (not counting statistics, not repeatability/reproducibility), but instead purely reflects true variation in the U (ppm) of the RM itself, as we probably already knew.
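Assuming these components combine in quadrature, the arithmetic behind point 3 is easy to check; the 2% figures below take the upper end of the ranges discussed above:

```python
import math

total_pct = 15.0    # observed spot-to-spot error on U (ppm), 1-sigma %
counting_pct = 2.0  # component 1: counting statistics (upper estimate)
shrimp_pct = 2.0    # component 2: hypothetical SHRIMP repeatability

# Residual attributable to true heterogeneity of the RM
heterogeneity_pct = math.sqrt(total_pct ** 2
                              - counting_pct ** 2
                              - shrimp_pct ** 2)
# about 14.7%: the heterogeneity term dominates the quadrature budget
```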
cwmagee commented 3 years ago

Hi guys, I had a quick look at this, and I would like to point out that all the repeat measurements on grains (2.2, 7.2, 10.2) are generally within 5% of the initial value, suggesting that grain-to-grain variability is the main factor in differing U concentration here. cheers, Chuck

sbodorkos commented 3 years ago

@cwmagee true, although it is really "fragment-to-fragment" variability, because 91500 is a single (albeit huge) crystal.

NickRodionov commented 3 years ago

Thank you all for the comments! It took me a long time to carefully understand everything; for that, I needed to refresh my memory of SQUID 2 quite a lot. Many thanks to @sbodorkos for your excellent clarifications! I just have to find out how to adapt your Task to Temora. We measure Temora + 91500 as primary standards, and additionally M257 as a secondary standard (of both ratio and concentration). Usually a mount has more than two fragments of each concentration standard, both 91500 and M257, so having at least 4 analyses of each per session, I have long been observing cross-variations in the influence of their inhomogeneity on the concentration values. Simon, please don't close this discussion until I have had a chance to ask some clarifying questions.

sbodorkos commented 3 years ago

Hi @NickRodionov, if I updated the Task so that 91500 is the concentration standard (and the errors in Uppm were propagated to all other analyses) even when Temora is selected as the ratio standard, would that help?

NicoleRayner commented 3 years ago

Hi @sbodorkos, for the above, are you talking about propagating errors from sources 1 and 2 above only? Or also 3 (true heterogeneity)? Without the latter, I think this gives a false sense of security about the error magnitude, and really we need a long-term sense of what the heterogeneity for a specific CRM is. This can be monitored if labs export their CRM data as individual csvs and/or the weighted mean of 238/196 for a given session, and track the variability over time to assess heterogeneity. I would be worried about building the error propagation into a Task and it then getting used indiscriminately (and without proper assessment of heterogeneity), versus as a separate expression.

NickRodionov commented 3 years ago

... I would be worried about building the error propagation into a task and then it getting used indiscriminately (and without proper assessment of heterogeneity) vs as a separate expression.

Hi @NicoleRayner! Indeed, I share your concern. But the situation in our laboratory is such that we must demonstrate the entire sequence by which each uncertainty is obtained. This complete information will be provided only to an expert metrologist: uncertainty = measured error + the averaged uncertainty in the corresponding CRM data + a notional inter-mount systematic uncertainty + etc. The customer will only get to see the total of them! I see no other way; I will be happy if you advise me anything.

NickRodionov commented 3 years ago

Hi @NickRodionov, if I updated the Task so that 91500 is the concentration standard (and the errors in Uppm were propagated to all other analyses) even when Temora is selected as the ratio standard, would that help?

Hi @sbodorkos, that would be great!

NicoleRayner commented 3 years ago

Hi @NickRodionov - maybe I am missing something here, but my thought was that Simon (or you, or me?) could write the expression, save it as an xml and post it here. Then you/me/whoever else wants it could import that expression into their preferred task, save the entire task (including that new uncertainty expression) as a Custom Squid3 task, then use that in their lab moving forward. So the calculation is there and exposed, but only for those who choose to use it (and I am assuming those would be fairly advanced users and thus less likely to abuse the propagation).

NickRodionov commented 3 years ago

@NicoleRayner, I completely agree!

NicoleRayner commented 3 years ago

Perfect! I think writing this as a stand-alone expression, not a task, is the way to go. You can then bring this expression into whatever task you need.

cwmagee commented 3 years ago

Hi guys, This is just a personal observation from trying to run trace elements last year, but our supply of 91500 has a much larger chip-to-chip variation than our M127 or M257 supplies. G7 and G8 also seem to be more homogeneous.

The chip-to-chip variation is much larger than analytical errors for all but the very low concentration elements (e.g. LREE).

sbodorkos commented 3 years ago

@NicoleRayner My component of uncertainty no. 2 is essentially a theoretical construct - I can't think of any way it could be quantified experimentally, because we would need a material (ideally a zircon-like matrix) that was homogeneous chemically (rather than just isotopically) within and between SHRIMP-pits, and I am not sure how such a material could be sourced/verified.

So I have proceeded on the basis that only no. 1 is real, and that all other sources of error must be attributed to no. 3. I will upload my updated SQUID 2.50 Task (in Task form because SQUID 2.50 needs between 6 and 10 user-defined Equations to do the work, depending on whether the Concentration RM is the same as the Isotopic RM or not) and some documentation to a subsequent post.

Streamlining this Task for use in Squid3 would be useful and educational (I myself don't yet have a good feel for how to do it), but it's a job I am not going to get to any time soon... in the short term, I leave it as an exercise for interested users :)

cwmagee commented 3 years ago

Hi Simon, We get good estimates of uncertainty #2 from repeat measurements on single chips for concentration standards, since zoning in most of them is gradual on the tens of microns scale. We should have a discussion of this in the trace element part of Cate's paper, if I ever get that far in the writeup. Sneak preview says that it's small for those trace elements with enough counts for counting stats not to dominate.

sbodorkos commented 3 years ago

@cwmagee it is a good point, I guess if you used one of these Sri Lankan gem zircons and mounted a bunch of large chips (15?), and then you did a set number of analyses (20?) on each chip, in a proper round-robin to minimise instrument-related factors, you could probably generate 15 estimates of SHRIMP-based spot-to-spot uncertainty, and you could compare those to each other somehow... And you could probably say something about chip-to-chip variation at the same time.

sbodorkos commented 3 years ago

@NickRodionov here is an updated Task. This one is set up to use 91500 as the concentration standard, but it explicitly assumes that 91500 is NOT the ratio standard (instead, that can be whatever you like, and I have used TEMORA2 in this example). In that sense, this Task is complementary to my original Task (which explicitly assumes 91500 is both standards). The ZIP contains the SQUID 2.50 Task, a 'condensed' version of the Prawn XML file I used for testing (to save you sourcing your own), as well as 'live' and 'frozen' versions of the SQUID 2.50 workbook containing the results.

SquidTask_Zircon 09pk exp=2 91500ConcOnly.Bodorkos.zip

Here are the 'Reduce Data' settings I used (note that some of the user-defined expressions are explicitly coded to only function on Sample-analyses prefixed '915'):

2021-02-05_ReduceData

And here is the set of user-defined Equations (3–4 and 7–13) that do the work:

2021-02-05_UserTaskEqns

Equation 3 calculates U Concentration Constants (and their numerically-perturbed %err values) solely for the designated concentration standard (here, prefix 915). Equation 4 then calculates the simple arithmetic mean of the spot-values (following the practice of SQUID 2.50 and Squid3).

Equations 5 and 6 are purely to illustrate that the StDev-based solution @NickRodionov is already using to address this issue is both simple and effective (as the ensuing calculations demonstrate).

Equation 7 calculates the spot-to-spot (external) error based on the 91500 dataset. Unfortunately, the sqWtdAv function does not output MSWD, which is annoying. I guess I could derive it from the probability-of-fit and the number of analyses, if it would be useful/informative. Equation 8 isolates the external error (1sigma, %), which is necessary in SQUID 2.50 to enable the value to be used across both the StandardData and SampleData sheets in subsequent calculations.

Equation 9 duplicates the U Concentration Constants calculation (and their numerically-perturbed %err values), but this time for every spot in the StandardData and SampleData sheets. Equation 10 augments the numerically-perturbed %err value for each spot, by quadratic addition of the external error (1sigma, %) value derived from the 91500 measurements. This process is analogous to how we calculate 206Pb/238U ages for SampleData rows, using a single external error derived from the StandardData set.
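In code, equation 10's quadratic addition is a one-liner; the numbers in the example are illustrative, of the order of those discussed for 91500 above:

```python
import math

def expanded_pct_err(spot_pct, external_pct):
    """Augment a spot's numerically-perturbed %err by quadratic
    addition of the session's external error (both 1-sigma, %)."""
    return math.hypot(spot_pct, external_pct)

# e.g. a 1.5% spot error combined with a ~14.5% external error
expanded = expanded_pct_err(1.5, 14.5)
```

As in the 206Pb/238U age case, the external term dominates whenever the individual spot errors are small.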

Equation 11 converts the spot-specific concentration constants into U (ppm) values, by reference to the average value determined on 91500 (equation 4), and the "true" U (ppm) value for 91500 (81 ppm; included in this Task as a Constant). Equations 12 and 13 simply convert each spot-specific numeric %err and expanded %err into 1sigma absolute values, with units of ppm.

Equations 14–16 are purely to illustrate the role of the external error when applied back to the 91500 dataset from which it was derived in the first place. Calculating the weighted mean of the 91500 analyses with expanded uncertainties should give a result with an MSWD ~ 1 (because this is the basis of Ludwig's maximum-likelihood estimator, which he built to approximate external errors). And it does work, although the effect is spoiled slightly by the failure of sqWtdAv to return an MSWD value. Hopefully the probability-of-equivalence value (~0.37) is familiar enough for people to recognise that the associated MSWD would be close to 1.

cwmagee commented 3 years ago

Simon, the 91500 chip to chip concentration variation is up to 50% (as, indeed, are the various literature values for Hf concentration), and the intra spot uncertainty is 0.5 to 1%, so you don't need very many spots to see the effect there. The various Nasdala RZs are better, but I think you can still pick chip to chip differences by eye off of a weighted mean. If I ever stop doing stupid yet urgent things I'll get up to that.

NickRodionov commented 3 years ago

@sbodorkos thank you!

bowring commented 3 years ago

@NickRodionov - are we ready to close this issue?

NickRodionov commented 3 years ago

@bowring - yes, thanks to all