International-Soil-Radiocarbon-Database / ISRaD

Repository for the development and release of ISRaD data and tools
https://international-soil-radiocarbon-database.github.io/ISRaD/

QA/QC numerical limits on Fraction Modern values #128

Closed Kate-Heckman closed 5 years ago

Kate-Heckman commented 5 years ago

Would it be possible for the QA/QC to create a warning if the Fraction Modern values are greater than 2? I'm reviewing a template right now, and someone entered percent modern in the fraction modern column and it passed QA/QC. I don't know how difficult it is to add code to the QA/QC, but this addition would be helpful.

Many thanks,

Kate

jb388 commented 5 years ago

Hi Kate, This is an easy addition. I actually thought it was already implemented (the range was 0 to 1.8 for a while). I can look into it. Jeff

Kate-Heckman commented 5 years ago

Hi Jeff,

Thanks! It may already be in there. The template I’m reviewing is from back in May, so the QA/QC was probably updated since then.

Many thanks!

Kate

Sent by email in reply to Jeff B's comment of Friday, December 14, 2018, 11:35 AM.

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.

greymonroe commented 5 years ago

The template info file currently does not have a maximum value for frc_fraction_modern (although it has a minimum = 0). We can add a maximum to the info file and the QAQC should update accordingly. We just need to run ISRaD.build to make sure that it doesn't mess anything up before pushing to the repo.

First, can we confirm that a maximum value is needed for this column? There might be a reason it wasn't there originally that we need to consider.
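The mechanism described in this comment (limits live in the info file, and the QAQC reads them) can be sketched roughly as follows. This is an illustration in Python, not code from the ISRaD R package: `INFO`, `INFO_WITH_MAX`, and `check_range` are hypothetical names, the info file is assumed to reduce to a per-column (minimum, maximum) pair, and 1.8 is just the candidate maximum mentioned earlier in the thread.

```python
import math

# Hypothetical stand-in for the template info file:
# column name -> (minimum, maximum); None means "no limit set".
INFO = {"frc_fraction_modern": (0.0, None)}


def check_range(column, values, info):
    """Return warning strings for values outside the limits in `info`."""
    lo, hi = info[column]
    warnings = []
    for row, value in enumerate(values, start=1):
        if value is None or math.isnan(value):
            continue  # missing values are not range-checked in this sketch
        if lo is not None and value < lo:
            warnings.append(f"{column} row {row}: {value} is below minimum {lo}")
        if hi is not None and value > hi:
            warnings.append(f"{column} row {row}: {value} is above maximum {hi}")
    return warnings


# Second entry is percent modern typed into the fraction modern column.
values = [0.987, 98.7, 1.12]

# With only a minimum set, the percent-modern entry passes unnoticed.
print(check_range("frc_fraction_modern", values, INFO))

# Adding a maximum to the info table is enough to flag it.
INFO_WITH_MAX = {"frc_fraction_modern": (0.0, 1.8)}
print(check_range("frc_fraction_modern", values, INFO_WITH_MAX))
```

The point of the design is that the check itself never changes; only the limits table does.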

coreylawrence commented 5 years ago

My personal opinion is in line with Kate's request. That is, there is a theoretical maximum value for natural abundance radiocarbon. I think 2 is a reasonable cut-off. If we wanted to be more conservative we could use the maximum value of the atmosphere during the bomb peak (which I don't know off the top of my head). However, I'd guess that is setting things too high.

Short version: yes, we should implement a maximum (for fraction modern and for delta 14C); we just need to decide on what value to use.
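Because the limit would apply to both fraction modern and delta 14C, it may help to see how one maximum maps onto the other. The sketch below assumes the standard Stuiver and Polach (1977) convention, in which delta 14C is decay-corrected from the year of measurement back to 1950 using a mean life of 8267 years. The function names are illustrative and are not taken from ISRaD.

```python
import math

MEAN_LIFE_YEARS = 8267.0  # mean life corresponding to the 5730-year half-life


def fm_to_delta14c(fm, year_measured):
    """Convert fraction modern to delta 14C (per mil), decay-corrected to 1950."""
    decay = math.exp((1950.0 - year_measured) / MEAN_LIFE_YEARS)
    return (fm * decay - 1.0) * 1000.0


def delta14c_to_fm(delta14c, year_measured):
    """Inverse of fm_to_delta14c."""
    decay = math.exp((1950.0 - year_measured) / MEAN_LIFE_YEARS)
    return (delta14c / 1000.0 + 1.0) / decay


# The delta 14C value equivalent to a fraction modern cap of 1.8.
print(fm_to_delta14c(1.8, 1950))
print(fm_to_delta14c(1.8, 2019))
```

Under this convention a fraction modern cap of 1.8 corresponds to a delta 14C cap of about 800 per mil, slightly lower for recent measurement years.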

As a side note, we might include a specific error statement for values of the variables falling outside of the set range that reads something like: "The radiocarbon values reported fall outside of the acceptable range for natural abundance measurements, please check your data. If the values you have reported are correct, your data are either from a tracer study which cannot be included in this database, or your samples may be contaminated. In either case, please contact ISRaD for additional guidance."
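One way the suggested error statement could be attached to a range check is sketched below. The message text is the wording proposed in this comment; the function name `range_error` and its signature are hypothetical.

```python
OUT_OF_RANGE_MESSAGE = (
    "The radiocarbon values reported fall outside of the acceptable range for "
    "natural abundance measurements, please check your data. If the values you "
    "have reported are correct, your data are either from a tracer study which "
    "cannot be included in this database, or your samples may be contaminated. "
    "In either case, please contact ISRaD for additional guidance."
)


def range_error(column, value, lo, hi):
    """Return None if lo <= value <= hi, else the column, value, and message."""
    if lo <= value <= hi:
        return None
    return f"{column} = {value}: {OUT_OF_RANGE_MESSAGE}"


print(range_error("frc_fraction_modern", 98.7, 0.0, 1.8))
```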

jb388 commented 5 years ago

The reason we didn't implement this initially was because of the possibility of labeled samples.

Do we want to categorically exclude labeled samples?

Kate-Heckman commented 5 years ago

Hi Jeff,

I’m still digging through furlough emails. If you haven’t already resolved this issue, these are my brief thoughts: I would vote for excluding labeled samples. Besides EBIS, I don’t think there are a lot of relevant soil studies that even use a 14C label.

Many thanks,

Kate

Sent by email in reply to Jeff B's comment of Wednesday, December 19, 2018, 2:39 PM.

kjmcfarlane commented 5 years ago

I agree with Kate; I don't think we want to include labeled samples. It will be too easy to mistakenly include low-level labeling data with natural abundance (e.g., EBIS or pulse-chase labeled litter to SOM) because they look too similar to bomb-spike values.

coreylawrence commented 5 years ago

I also agree on that point. I should probably add a line about that in the manuscript as well...

kjmcfarlane commented 5 years ago

Good idea, Corey. I also think the QAQC on the FM is important - I had an old template (one of Elder Paul's papers) that I worked on that had PMC in the FM column.

jb388 commented 5 years ago

Hi All---thanks for weighing in. I agree with all the sentiments here. I'll implement an upper limit for FM (thinking 1.8), and we should be sure to clarify in our documentation that data from labeled samples will not be ingested.

jb388 commented 5 years ago

OK, so now I am remembering that we discussed using "Inf" as a proxy for the fm value in studies that reported samples with bomb-14C as "modern". Obviously these records fail QAQC when the upper limit is set to 1.8.

Unless we can come to consensus for an alternative to reporting "modern" fm values, the limits cannot be imposed in QAQC.

Kate-Heckman commented 5 years ago

I'm wondering in what way a value of "modern" could be useful? FM values of "modern" can't be included in any statistical test or model unless the whole Fm dataset was reclassified into categorical variables. How many times is data reported as "modern" instead of giving actual numerical values? I would suggest removing any data that isn't reported as a number and imposing the min=0, max=1.8 rule. Just my opinion though...

kjmcfarlane commented 5 years ago

A value of "modern" isn't useful, except that it does mean FM ≥ 1. I worry "inf" is misleading. Is it a safe assumption that if the values were put in as "modern" there is no quantitative data available to calculate a useful value (e.g., FM was calculated from a conventional radiocarbon date)?

Kate-Heckman commented 5 years ago

Cool! Thanks for working through all the devils in the details!

Sent by email in reply to Jeff B's comment of Friday, February 01, 2019, 5:32 AM.

alkalifly commented 5 years ago

Unfortunately, there ARE some studies that just report "modern" for some layers, viz. Leavitt_2007 and Shen_2001. Both of those studies report conventional (uncalibrated) radiocarbon ages, but for the shallowest layers, they avoid the awkwardness of a negative radiocarbon age by just saying "modern".
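The link between a negative radiocarbon age and a fraction modern above 1 can be made concrete. The sketch assumes the usual definition of a conventional radiocarbon age, which uses the Libby mean life of 8033 years; the function names are illustrative.

```python
import math

LIBBY_MEAN_LIFE = 8033.0  # years, used by convention for radiocarbon ages


def age_to_fm(age_bp):
    """Fraction modern implied by a conventional radiocarbon age (years BP)."""
    return math.exp(-age_bp / LIBBY_MEAN_LIFE)


def fm_to_age(fm):
    """Conventional radiocarbon age (years BP) implied by a fraction modern."""
    return -LIBBY_MEAN_LIFE * math.log(fm)


print(age_to_fm(0.0))   # an age of zero is exactly fraction modern 1
print(fm_to_age(1.1))   # bomb-enriched carbon gives a negative "age"
print(fm_to_age(0.9))   # pre-bomb carbon gives a positive age
```

Any sample with fraction modern above 1 would come out with a negative age, which is the value such studies replace with "modern".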

In these cases, the "inf" value serves as a flag to let us know that the fraction modern is greater than 1. Limited as it may be, this is still useful information to have for certain types of analysis (e.g., constraining global models), and it would be a disservice to remove it altogether. But implementing a maximum value for FM would cause both of the entries I mentioned to immediately fail QAQC, unless that information was removed.

I'm not sure what the best solution is at this point, but it seems like we should be able to keep this information. Is the only reason to enforce a limit to prevent people from entering pMC instead of FM? Is there some other way of addressing that without prohibiting the "inf" flag?

jb388 commented 5 years ago

Thanks for chiming in, Paul. I think that due diligence on the part of expert reviewers should catch most of these problems. However, it is nice to have the "upstream" check.

An alternative solution could be a fm_14C_modern column with yes/no options, which would ONLY be used when fm/14C data are reported as "modern".

alkalifly commented 5 years ago

Couldn’t we just as easily have the code check to see if the value is > 1.8 OR is Inf?

greymonroe commented 5 years ago

> Couldn’t we just as easily have the code check to see if the value is > 1.8 OR is Inf?

Sure, we would just have to write it in the QAQC function. The code is trivial. However, we have discussed such issues in the past and tried to avoid "special cases" like this, where the controlled vocab/values deviate from the structure built into the Template Info file. The overall thinking was that this is a slippery slope, where our strict rule that the Info file contains all the information needed to know what values to enter in templates is no longer true. When we start putting "asterisks next to" columns, it becomes harder to keep track of what's going on.

That being said, we can certainly make whatever changes we decide on to the QAQC. We just need to decide as a group if it's important enough. Let me know.
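For reference, the "0 to 1.8 OR Inf" special case discussed here might look like the following. This is a Python sketch rather than the actual QAQC function, and `fm_is_valid`, `FM_MIN`, and `FM_MAX` are hypothetical names. Treating NaN as invalid is an assumption of the sketch.

```python
import math

FM_MIN, FM_MAX = 0.0, 1.8  # candidate limits from the thread


def fm_is_valid(value):
    """True if value lies in [FM_MIN, FM_MAX] or is +Inf (reported as "modern")."""
    if math.isnan(value):
        return False  # assumption: missing values are handled elsewhere
    return value == math.inf or FM_MIN <= value <= FM_MAX


for candidate in (1.05, math.inf, 98.7, -0.2):
    print(candidate, fm_is_valid(candidate))
```

The check is short, but the Inf exception lives in code rather than in the info file, which is the drawback raised in this comment.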

alkalifly commented 5 years ago

In that case, could we not just specify our desired rule (0–1.8 OR Inf) in the info file? This seems less like a "special case" to me than adding a whole new column just to flag values reported as modern. Or can allowed values only be specified within a single continuous range?

At this point, it seems like we have the following options:

  1. Change nothing, go on as we have been, and count on expert reviewers to catch any mistakes where people entered percent rather than fraction modern.

  2. Implement a check to make sure that values are valid, with valid values being 0–1.8 OR Inf (or something other than 1.8, if preferred)

  3. Implement a check for a single continuous range (e.g., 0-1.8) without allowing Inf, but have a separate column that flags samples reported as "modern"

  4. Implement a check for a single continuous range and just throw out all of the data reported as "modern"

Personally, from a user perspective, I'm fine with any of the first three options. The only option I am strongly against is 4.
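Option 3 could be sketched as a row-level check like the one below. The flag column name `fm_14C_modern` and its yes/no vocabulary come from the proposal earlier in the thread; the function `check_row`, the dictionary representation of a row, and the rule that a flagged row carries no numeric value are assumptions made for illustration.

```python
def check_row(row):
    """Return a list of problems for one template row under option 3."""
    problems = []
    fm = row.get("frc_fraction_modern")
    flag = row.get("fm_14C_modern")

    # The flag column uses a controlled vocabulary.
    if flag not in (None, "yes", "no"):
        problems.append("fm_14C_modern must be 'yes' or 'no'")

    # A single continuous range, with no Inf exception.
    if fm is not None and not (0.0 <= fm <= 1.8):
        problems.append("frc_fraction_modern is outside 0-1.8")

    # Assumption: the flag is only for rows with no numeric value reported.
    if flag == "yes" and fm is not None:
        problems.append("fm_14C_modern is 'yes' but a numeric value is also given")

    return problems


print(check_row({"frc_fraction_modern": 1.05}))
print(check_row({"frc_fraction_modern": None, "fm_14C_modern": "yes"}))
print(check_row({"frc_fraction_modern": float("inf")}))
```

Under this layout both columns can be described fully by the info file: one numeric range and one two-value vocabulary.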

jb388 commented 5 years ago

My vote would be for option 3. This seems to be the most logical given our overall approach, i.e. adding columns as needed, and keeping the info file consistent.

greymonroe commented 5 years ago

Yes, 3 is what we have been doing, as adding columns is fairly straightforward.

alkalifly commented 5 years ago

This is not correct; so far, we have been doing 1, not 3. There ARE currently templates with Inf in the fraction modern column, and we do NOT have any column to flag values reported as modern. If we wanted to adopt option 3, it would require making such a column in the master template, AND going through old templates to convert them to the new protocol.

On Feb 17, 2019, at 13:13, Grey Monroe notifications@github.com wrote:

yes 3 is what we have been doing
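The conversion of old templates that option 3 would require might look roughly like this. It is only a sketch: `migrate_row` is a hypothetical helper, rows are represented as plain dictionaries, and `fm_14C_modern` is the flag column name proposed earlier in the thread, not an existing ISRaD column.

```python
import math


def migrate_row(row):
    """Return a copy of `row` with an Inf fraction modern moved to the flag column."""
    new_row = dict(row)
    fm = new_row.get("frc_fraction_modern")
    if isinstance(fm, float) and fm == math.inf:
        new_row["frc_fraction_modern"] = None  # no numeric value was reported
        new_row["fm_14C_modern"] = "yes"       # keep the "modern" information
    return new_row


old_rows = [
    {"frc_fraction_modern": 0.93},
    {"frc_fraction_modern": float("inf")},
]
for row in old_rows:
    print(migrate_row(row))
```

Rows with ordinary numeric values pass through unchanged, so the same helper could be run over every existing template.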


greymonroe commented 5 years ago

Yes, sorry, I mean that in general we have added flagging columns in situations like this.

alkalifly commented 5 years ago

Gotcha. Yes, the extra flag column would definitely be the most consistent with what we already have for other situations. I just want to make sure we know what would have to happen to successfully implement it.

On Feb 17, 2019, at 13:53, Grey Monroe notifications@github.com wrote:

Yes sorry, i mean in general, we have added flagging columns in situations like this


jb388 commented 5 years ago

Currently no upper limit is enforced for FM values by QA/QC, and "Inf" is used for studies reporting FM as "modern".