International-Soil-Radiocarbon-Database / ISRaD

Repository for the development and release of ISRaD data and tools
https://international-soil-radiocarbon-database.github.io/ISRaD/

fill_dates.R generates wildly incorrect dates #171

Closed aahoyt closed 5 years ago

aahoyt commented 5 years ago

ISRaD.extra.fill_dates.R seems to be incorrectly filling inc_obs_date_y from lyr_obs_date_y with dates that are sometimes extremely different.

Examples (should match)

Dutta_2006:
template: lyr_obs_date_y = 2001, inc_obs_date_y is blank
ISRaD_extra: inc_obs_date_y = 1983 (should be filled from lyr_obs_date_y)

Taylor_2015:
template: lyr_obs_date_y = 2012 or 2013, inc_obs_date_y is blank
ISRaD_extra: inc_obs_date_y = 2012, 1998, 1970, 1971, 1986 (should be filled from lyr_obs_date_y)

In total there are 267 profiles with an inc_date vs lyr_date mismatch in ISRaD_extra, and the two dates are sometimes extremely different. In ISRaD_data, there are 62 date mismatches, but these normally differ by only 1-2 years, which is reasonable (e.g. samples collected in 2001, incubated in 2002).

I don't see any obvious reason this date mismatch is being introduced. Also, this could be accounting for or related to our issues with the ISRaD_extra delta delta function, which was also never fixed.
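A minimal sketch of how the mismatch counts above could be tallied, assuming the incubation rows have already been joined to their parent layers. The data frame `checks`, the two `Example_*` rows, and the 2-year tolerance are illustrative assumptions, not values from ISRaD; only the Dutta_2006 and Taylor_2015 years are taken from the examples above.

```r
# Toy joined table: one row per incubation with its parent layer's year.
# The Example_A / Example_B rows are invented for illustration.
checks <- data.frame(
  entry_name     = c("Dutta_2006", "Taylor_2015", "Example_A", "Example_B"),
  lyr_obs_date_y = c(2001, 2012, 2001, 1995),
  inc_obs_date_y = c(1983, 1998, 2002, 1995),
  stringsAsFactors = FALSE
)

# Assumed tolerance: incubating a sample 1-2 years after collection is plausible.
tolerance_y <- 2

checks$diff_y     <- checks$inc_obs_date_y - checks$lyr_obs_date_y
checks$mismatch   <- checks$diff_y != 0
checks$suspicious <- abs(checks$diff_y) > tolerance_y

n_mismatch   <- sum(checks$mismatch)    # any difference at all
n_suspicious <- sum(checks$suspicious)  # differences beyond the tolerance

print(checks[checks$suspicious, c("entry_name", "lyr_obs_date_y", "inc_obs_date_y")])
```

Separating "any mismatch" from "beyond tolerance" keeps the plausible 1-2 year gaps out of the list of rows that need a closer look.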

greymonroe commented 5 years ago

The fill date function looked wrong. It wasn't doing any matching between the layer and fraction tabs, for example. It was basically saying that if the frc date was NA, then make it a value from lyr date, without specifying which one. I am making the correction now.
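To illustrate the kind of bug described above, here is a small base R sketch contrasting an unmatched fill with a keyed fill. This is not the code from fill_dates.R; the toy tables, the use of `lyr_name` as the join key, and all values are assumptions made for the example.

```r
# Toy tables (values invented). lyr_name is assumed to be the key that links
# a fraction row to its parent layer row.
layer <- data.frame(
  lyr_name       = c("A", "B", "C"),
  lyr_obs_date_y = c(2001, 1997, 2012),
  stringsAsFactors = FALSE
)
fraction <- data.frame(
  lyr_name       = c("C", "A", "B"),
  frc_obs_date_y = c(NA, NA, 1949),
  stringsAsFactors = FALSE
)

na_rows <- is.na(fraction$frc_obs_date_y)

# Unmatched fill: NA rows take layer years by position, so each fraction
# can end up with the year of an unrelated layer.
unmatched <- fraction$frc_obs_date_y
unmatched[na_rows] <- layer$lyr_obs_date_y[seq_len(sum(na_rows))]

# Keyed fill: look up each fraction's parent layer first, then fill only NAs.
idx   <- match(fraction$lyr_name, layer$lyr_name)
keyed <- ifelse(na_rows, layer$lyr_obs_date_y[idx], fraction$frc_obs_date_y)

print(data.frame(lyr_name = fraction$lyr_name, unmatched, keyed))
```

In the keyed version the fraction from layer "C" receives 2012 and the one from layer "A" receives 2001, while the existing value 1949 is left untouched. In the real tables the key would likely need several columns (entry, site, profile, layer) rather than one.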

jb388 commented 5 years ago

Nice catch, Grey. Hopefully this fixes it! @aahoyt you're probably right that this is the issue with the delta delta function---that function itself seems fine.

aahoyt commented 5 years ago

Thanks Grey! Let me know when it's updated & I will check the delta delta calculations

greymonroe commented 5 years ago

Ok can you check now?


aahoyt commented 5 years ago

checking now

aahoyt commented 5 years ago

Based on my checks, the date issue doesn't seem to be fixed. Here's my code to check, with a figure. Dates should fall mostly along the 1:1 line.

# Check date issue
library(dplyr)
library(ggplot2)

inc_data_check <- ISRaD_extra$incubation %>%  # Start with incubation data
  left_join(ISRaD_extra$layer)                # Join to layer data

ggplot() +
  geom_point(data = inc_data_check,
             aes(x = inc_obs_date_y, y = lyr_obs_date_y), size = 5) +
  theme_bw()

[Image: filldateissue]

And in case I just wasn't able to update the package for some reason, I downloaded the ISRaD_extra file and searched for Dutta_2006 on the incubation tab (inc_obs_date_y still says 1983 instead of 2001) https://international-soil-radiocarbon-database.github.io/ISRaD/database/

greymonroe commented 5 years ago

OK, it should be fixed now, but let's double-check.

[Image: screen shot 2019-02-24 at 1 37 58 pm]

aahoyt commented 5 years ago

Yes, looks fixed to me too. Thanks!

The only study where the incubation date is now earlier than the layer date (how do you incubate a sample before you collect it?) is Czimczik_2007, but that issue comes from the template, not the function.

aahoyt commented 5 years ago

However, the filled frc_obs_date_y looks like it may still have a similar, but less widespread, issue.

(red is original, black is filled, so we would expect all black along the 1:1 line)

Example study with issues to check is Baisden_2002. Blank frc_obs_date_y values from profile Riverbank1997 (lyr_obs_date_y = 1997) get filled with 1949 (looks like a value for other fractions on the same template) instead of 1997.

[Image: frc_date_fill]
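A check like the one described above could be sketched as follows: wherever the template value was blank, the filled value should equal the parent layer's year, and wherever it was not blank, the fill should leave it alone. The data frame `frc`, its `frc_original_y` / `frc_filled_y` columns, and the "OtherProfile" row are hypothetical; only the Riverbank1997 years (1997 and 1949) come from the example above.

```r
# Hypothetical comparison table: template value vs. filled value per fraction,
# alongside the parent layer's year. Column names are made up for this sketch.
frc <- data.frame(
  pro_name       = c("Riverbank1997", "Riverbank1997", "OtherProfile"),
  lyr_obs_date_y = c(1997, 1997, 1949),
  frc_original_y = c(NA, 1997, 1949),
  frc_filled_y   = c(1949, 1997, 1949),
  stringsAsFactors = FALSE
)

was_blank <- is.na(frc$frc_original_y)

# Blank in the template but filled with something other than the layer year.
bad_fill <- which(was_blank & frc$frc_filled_y != frc$lyr_obs_date_y)

# Present in the template but changed by the fill (should never happen).
overwritten <- which(!was_blank & frc$frc_filled_y != frc$frc_original_y)

print(frc[bad_fill, ])
```

Using `which()` drops any `NA` comparisons, so rows that are still blank after the fill do not show up as spurious hits.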

aahoyt commented 5 years ago

Also, we should check to see if the various other fill functions are facing similar issues...

greymonroe commented 5 years ago

OK. See fig1 for the new filled dates. They should be correct now (at least the fill part; there are still some inconsistencies that might be errors in the datasheets).

[Image: screen shot 2019-02-24 at 7 24 33 pm]

Below are the latitudes and longitudes. There appear to be some issues, but I checked and they don't seem to be caused by the fill function.

[Image: screen shot 2019-02-24 at 7 29 12 pm]
[Image: screen shot 2019-02-24 at 6 48 17 pm]

alkalifly commented 5 years ago

For the longitudes, the location of some of those dots (the ones that are way off) looks like a missing or spurious negative sign
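One way to test the sign hypothesis, as a rough sketch: flag rows where the profile longitude is (nearly) the exact negative of the site longitude. The `coords` data frame, its values, and the 0.01 degree tolerance are invented for illustration; `site_long` and `pro_long` are assumed to be the relevant columns.

```r
# Toy coordinates (invented). Row 1 mimics a flipped sign, row 2 matches,
# row 3 differs slightly for a legitimate reason.
coords <- data.frame(
  site_long = c(-121.5, 10.2, -60.0),
  pro_long  = c(121.5, 10.2, -60.3)
)

tol <- 0.01  # assumed tolerance in degrees

# A sign flip means pro_long is close to -site_long but not close to site_long
# (the second condition avoids flagging longitudes near zero).
sign_flip <- abs(coords$pro_long + coords$site_long) < tol &
             abs(coords$pro_long - coords$site_long) > tol

print(coords[sign_flip, ])
```

This only catches mirrored values, so profiles whose coordinates simply differ from the site's are not flagged.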


greymonroe commented 5 years ago

@alkalifly Yep, you're probably right. Do we want to manually change these?

jb388 commented 5 years ago

Just to note that there are some entries that have different pro_lat/pro_long than site_lat/site_long (i.e. they shouldn't line up on the 1:1 line).

greymonroe commented 5 years ago

This is fixed