MIxS-MInAS / extension-radiocarbon-dating

A MIxS extension proposal for 'radiocarbon dating' information about samples
Creative Commons Zero v1.0 Universal

Feedback from @joeroe #5

Open joeroe opened 3 weeks ago

joeroe commented 3 weeks ago

@nevrome invited me to take a look at this. I've been professionally overthinking radiocarbon data for https://xronos.ch for a few years, so please excuse the pedantry, I can't help it anymore!

FWIW, the difficulty of precisely describing the parameters of a calibrated date, and the existence of (many) different plausible calibrations of the same underlying date, led us away from incorporating this information alongside c14 dates in XRONOS (see https://github.com/xronos-ch/xronos.rails/issues/46 for some discussion). I tend to see calibrated dates as models, rather than data.

jfy133 commented 3 weeks ago

Thank you very much @joeroe ! No need for apologies, this is extremely helpful and I greatly appreciate your time!

I will definitely incorporate many of your suggestions! I need to think a few through first, as for this specific context I need to balance 'precision' (so to say) with 'user-friendliness'. The people we are primarily designing this for are not specialists but palaeogenomicists, who will often only really care about e.g. the calibrated ages (but I completely get your points), and sometimes I need to choose term names that are most identifiable to them.

For now (still have a month until I'm back full time), I have one outstanding question:

The calibration median was not something I had thought of myself (my understanding is indeed that it is a distribution with 'most likely' date point(s)), but rather a suggestion from @nevrome to add... I just want to check if he had any further motivation behind his suggestion?

nevrome commented 3 weeks ago

Hey @joeroe - thanks for taking the time and adding these very valuable comments.

About the calib_age_median: My perception is that few colleagues in archaeogenetics bother to give age information proper treatment via temporal resampling (from age ranges or post-calibration probability distributions). If the data does not include a median, then many may just compute a mid-point between the start and end point of whatever range you'll provide for their modelling application, which is arguably worse than the median. So my reasoning was guided by this conflict between 'accuracy' and 'convenience' you mentioned, @jfy133.
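To make the median vs. mid-point contrast concrete, here is a minimal Python sketch. It assumes a toy post-calibration distribution with invented weights (not real calibration output), encodes BC years as negative integers, and uses hypothetical helper names (`weighted_median`, `range_midpoint`, `resample_ages`). The last helper shows one simple form of temporal resampling: drawing calendar years in proportion to their probability.

```python
import random

# Toy post-calibration distribution (invented for illustration only).
# Calendar years are negative for BC; weights are relative probabilities.
years = [-1380, -1360, -1340, -1320, -1300, -1280, -1260, -1240, -1220, -1200]
weights = [2, 3, 5, 5, 5, 10, 15, 25, 20, 10]


def weighted_median(years, weights):
    """Return the first year at which the cumulative weight reaches half the total."""
    total = sum(weights)
    cumulative = 0
    for year, weight in zip(years, weights):
        cumulative += weight
        if 2 * cumulative >= total:
            return year
    return years[-1]


def range_midpoint(years):
    """Mid-point between the oldest and youngest end of the range."""
    return (years[0] + years[-1]) / 2


def resample_ages(years, weights, n, seed=0):
    """Draw n calendar years in proportion to their probability."""
    rng = random.Random(seed)
    return rng.choices(years, weights=weights, k=n)


print("median:  ", weighted_median(years, weights))
print("midpoint:", range_midpoint(years))
print("draws:   ", resample_ages(years, weights, 5))
```

With this skewed toy distribution the mid-point of the range sits 50 years away from the median, which is the kind of gap the paragraph above is worried about. A downstream analysis could be repeated over many resampled ages instead of relying on either number.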

The same applies to the high density regions (HDRs), i.e. what Joe introduces as "multiple ranges for a given threshold". My tool currycarbon computes them as well, both on a 1-sigma and a 2-sigma level (I'll have to rethink the term sigma here, I guess :smile:). I defined the "overall" 2-sigma range of a calibrated age as the start and end of the 2-sigma HDRs.

CalEXPR: [1] 1:3000±30BP
Calibrated: 1379BC >> 1364BC > 1238BC < 1131BC << 1124BC
1-sigma: 1364-1360BC, 1282-1197BC, 1169-1163BC, 1140-1131BC
2-sigma: 1379-1344BC, 1304-1124BC
                              ▁▁▁▁▁▁
                             ▁▒▒▒▒▒▒▁
                            ▁▒▒▒▒▒▒▒▒▁
               ▁          ▁▁▒▒▒▒▒▒▒▒▒▒▁    ▁   ▁▁
              ▁▒▁       ▁▁▒▒▒▒▒▒▒▒▒▒▒▒▒▁▁▁▁▒▁  ▒▒
            ▁▁▒▒▒▁▁    ▁▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▁▁▒▒▁
        ▁▁▁▁▒▒▒▒▒▒▒▁▁▁▁▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
 -1410 ┄──┬─────────────┬─────────────┬─────────────┬────────────┄ -1020
             > >                 ^              <<
               ─          ──────────────   ─   ──
             ──────    ───────────────────────────
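One common way to obtain such ranges from a discretised distribution is sketched below in Python. This is not currycarbon's implementation (which this thread does not show) but a generic approach under stated assumptions: an evenly spaced calendar grid, invented toy weights, BC years as negative integers, and hypothetical function names `hdr_ranges` and `overall_range`. Grid points are added in order of decreasing probability until the requested mass is covered, then adjacent points are merged into ranges.

```python
# Toy bimodal distribution on a 20-year grid (invented for illustration only).
years = list(range(-1380, -1100, 20))
weights = [1, 6, 9, 2, 1, 5, 12, 18, 20, 14, 7, 3, 1, 1]


def hdr_ranges(years, weights, mass):
    """Return (start, end) ranges of the highest density region covering `mass`.

    `years` must be sorted ascending; `weights` are relative probabilities.
    """
    total = sum(weights)
    # Visit grid points from most to least probable.
    order = sorted(range(len(years)), key=lambda i: weights[i], reverse=True)
    selected = set()
    cumulative = 0
    for i in order:
        if cumulative >= mass * total:
            break
        selected.add(i)
        cumulative += weights[i]
    # Merge neighbouring selected grid points into contiguous ranges.
    ranges = []
    start = None
    for i in range(len(years)):
        if i in selected and start is None:
            start = i
        if start is not None and (i + 1 == len(years) or i + 1 not in selected):
            ranges.append((years[start], years[i]))
            start = None
    return ranges


def overall_range(ranges):
    """Oldest start and youngest end across all ranges of one HDR level."""
    return ranges[0][0], ranges[-1][1]


one_sigma = hdr_ranges(years, weights, 0.68)
two_sigma = hdr_ranges(years, weights, 0.95)
print("68% HDR:", one_sigma)
print("95% HDR:", two_sigma)
print("overall 95% range:", overall_range(two_sigma))
```

Choices such as how ties are broken, whether ranges are reported at grid points or interpolated, and which mass thresholds count as "1-sigma" and "2-sigma" are exactly the kind of detail that can differ between tools.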

Very interesting to hear that different tools do this differently. Would be good to document this, but indeed difficult to figure this out as a user. Maybe calib_software and calib_version are sufficient to encode it implicitly. @MartinHinz and I once had a plan to write a benchmarking paper for different calibration tools to compare exactly these details.

Of course HDRs are more accurate, but also harder to work with. Colleagues with a desire to operate on this level of temporal accuracy will probably just re-calibrate themselves. And that is fine. As Joe says: calibrated dates are "models, rather than data", so whatever we provide here is essentially only for convenience.

There is the exception of sites where archaeologists created a chronological model and combined multiple lines of evidence to acquire more precise age estimates. In this case it would be better to use the carefully curated ages instead of re-calibrating with the sledge-hammer. Maybe it would make sense to add a field to MIxS-MInAS/extension-radiocarbon-dating for referencing one or multiple publications with more sophisticated age models. Our Varna paper may be an example for that.

So to summarise my suggestions:

  1. Document their drawbacks but keep calib_age_median, calib_age_oldest, calib_age_youngest as they are. They are for convenience and good enough for many applications.
  2. Introduce a field for referencing publications with more advanced age models.

Happy to get disabused about 1. :pray:
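As a rough illustration of what a record following these two suggestions could look like, here is a Python sketch. The `calib_*` names are the ones used in this thread; `age_model_refs` is a purely hypothetical name for the proposed reference field, and the sign convention (negative integers for BC) and the version placeholder are assumptions. The example values are the overall 2-sigma range and median from the currycarbon output above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CalibratedAgeRecord:
    # Convenience summaries discussed in this thread.
    calib_age_median: int
    calib_age_oldest: int
    calib_age_youngest: int
    # Software and version implicitly document how the summaries were derived.
    calib_software: str
    calib_version: str
    # Hypothetical field for suggestion 2; the name is an assumption.
    age_model_refs: List[str] = field(default_factory=list)

    def is_consistent(self) -> bool:
        """Check that the median lies between the oldest and youngest bound."""
        return self.calib_age_oldest <= self.calib_age_median <= self.calib_age_youngest


record = CalibratedAgeRecord(
    calib_age_median=-1238,    # 1238BC
    calib_age_oldest=-1379,    # 1379BC
    calib_age_youngest=-1124,  # 1124BC
    calib_software="currycarbon",
    calib_version="x.y.z",     # placeholder; no version is stated in the thread
)
print(record.is_consistent())
```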

joeroe commented 3 weeks ago

> If the data does not include a median, then many may just compute a mid-point between the start and end point of whatever range you'll provide for their modelling application, which is arguably worse than the median.

Well, according to Michczyński at least, it's no better. I admit I haven't looked into this myself, but it feels odd to contradict the (only?) published literature in a standard. I'm guilty of using point estimates in published work myself, but at least I knew I was simplifying. My worry here is that by including a field for a point estimate, you are communicating (as you say, to non-experts) that using a point estimate is unproblematic. If there is only the range, that at least gives a hint that there is more to calibrated radiocarbon than a single date.

> Very interesting to hear that different tools do this differently. Would be good to document this, but indeed difficult to figure this out as a user.

Yeah, I looked into it while working on https://github.com/joeroe/c14 and https://github.com/joeroe/ruby-radiocarbon. Of course there's the usual elephant in the room: that I have no idea what OxCal is doing.

A benchmark or wider technical comparison paper would be interesting. Let me know if you need any help!

> There is the exception of sites where archaeologists created a chronological model and combined multiple lines of evidence to acquire more precise age estimates. In this case it would be better to use the carefully curated ages instead of re-calibrating with the sledge-hammer.

I agree, but at this point are you still talking about a "radiocarbon date"? Is there scope in this standard to separate on the one hand the radiometric data, and on the other chronological models that use that data?

nevrome commented 3 weeks ago

Finally got around to reading this little Michczyński paper. A beautiful idea - I wish it was more common to publish little experiments like this. My takeaway from this paper is a bit different from yours, though, and maybe even from the author's.

[Image: screenshot, apparently Figure 3 of the Michczyński paper]

For me Figure 3 shows that these point estimates are usually not that far off from the true age. Good enough as a rough number to, for example, group aDNA samples by millennia. Very interesting to see that the mode, i.e. the maximum of the distribution, is performing so well. Michczyński himself (after arguing against any point estimate) concludes:

> If it is really essential to use a point estimate for the calendar age of the sample, then the mode (the value of the calendar age that corresponds to the maximum of the probability distribution of a calibrated 14C date) may be accepted as a point estimate, but we should remember that important differences between the mode and the true calendar age of the sample appear for some periods, which are characterized by a specific shape of calibration curve (see Figure 4).

I think I'll add this to currycarbon.
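For reference, extracting the mode from a discretised distribution is a one-liner. The Python sketch below assumes invented toy weights and BC years as negative integers; `calibrated_mode` is a hypothetical name, not a currycarbon or c14 function.

```python
# Toy bimodal distribution on a 20-year grid (invented for illustration only).
years = list(range(-1380, -1100, 20))
weights = [1, 6, 9, 2, 1, 5, 12, 18, 20, 14, 7, 3, 1, 1]


def calibrated_mode(years, weights):
    """Return the calendar year with the highest probability.

    If several years share the maximum, the first (oldest) one is returned,
    which is an arbitrary choice that a real tool would need to document.
    """
    best_index = max(range(len(years)), key=lambda i: weights[i])
    return years[best_index]


print(calibrated_mode(years, weights))
```

On plateaus of the calibration curve the maximum can be nearly flat, so the reported mode may jump around with small changes in the input, in line with the caveat in the quoted conclusion.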

One more tiny comment about what you wrote:

> I'm guilty of using point estimates in published work myself, but at least I knew I was simplifying. My worry here is that by including a field for a point estimate, you are communicating (as you say, to non-experts) that using a point estimate is unproblematic.

My experience with (archaeo)geneticists has been that they are usually aware of the uncertainties of radiocarbon ages, although maybe not to the full extent (which I, by no means, could claim for myself :wink:).

joeroe commented 2 weeks ago

I also had in my head that the mode was the way to go; probably also from this paper. For example it's the default of c14::cal_point(). It's been a while since I read it to be honest.

Thinking about it again, as Michczyński says, the probable deviation from the mode is purely a function of the measurement error and the calibration curve. So shouldn't it be possible to come up with a point estimate ± e.g. the 95% margin on either side? I can imagine that'd be more convenient than a range for a lot of applications.
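A rough Python sketch of that idea follows. Everything about the curve is an assumption: `toy_curve` is an invented linear stand-in for a real calibration curve, with a constant curve error, and ages are in cal BP (larger means older). The calibration step uses the usual normal likelihood with the measurement and curve variances added; the margins are the distances from the mode to the oldest and youngest end of the 95% highest density region.

```python
import math


def toy_curve(cal_bp):
    """Hypothetical linear calibration curve: returns (c14_age, curve_error)."""
    return 0.8 * cal_bp + 500.0, 15.0


def calibrate(c14_age, c14_error, grid):
    """Return normalised probabilities for each calendar year in `grid`."""
    densities = []
    for cal_bp in grid:
        mu, curve_error = toy_curve(cal_bp)
        variance = c14_error ** 2 + curve_error ** 2
        densities.append(
            math.exp(-((c14_age - mu) ** 2) / (2 * variance)) / math.sqrt(variance)
        )
    total = sum(densities)
    return [d / total for d in densities]


def mode_with_margins(grid, probs, mass=0.95):
    """Return (mode, margin towards older ages, margin towards younger ages)."""
    order = sorted(range(len(grid)), key=lambda i: probs[i], reverse=True)
    selected, cumulative = [], 0.0
    for i in order:
        if cumulative >= mass:
            break
        selected.append(i)
        cumulative += probs[i]
    mode = grid[order[0]]
    older = max(grid[i] for i in selected) - mode    # cal BP: larger = older
    younger = mode - min(grid[i] for i in selected)
    return mode, older, younger


grid = list(range(2900, 3355, 5))  # cal BP, 5-year steps
for error in (30, 60):
    probs = calibrate(3000, error, grid)
    print(f"3000±{error}BP ->", mode_with_margins(grid, probs))
```

On a real curve with wiggles and plateaus the two margins will generally be unequal, and the region between them may contain gaps, so a "mode ± margins" summary hides the multi-range structure of the HDRs.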