hmorlon / PANDA

Phylogenetic ANalyses of DiversificAtion

InfTemp data Age duplicates? #39

Closed stiatragul closed 1 year ago

stiatragul commented 3 years ago

Hello PANDA dev team,

First of all, thank you for continuing to develop this great package!

I am currently fitting multiple fit_env models with different paleoclimate (temperature) reconstructions. The reconstructed climate data sets I'm working with only have 271 rows (i.e. 271 time steps) compared to the much higher resolution data(InfTemp) available as part of the RPANDA package (which has 17632 rows).

Upon closer inspection, I noticed that some of the InfTemp$Age values are duplicated (same time) but the corresponding temperature values differ.

> head(InfTemp)
    Age Temperature
1 0.000    3.902176
2 0.000    2.900296
3 0.002    4.309984
4 0.002    5.172534
5 0.004    3.733446
6 0.004    4.309984

To make sure this is not a matter of rounding, I verified this with:

> InfTemp$Age[1] == InfTemp$Age[2]
[1] TRUE
> InfTemp$Age[3] == InfTemp$Age[4]
[1] TRUE

> plot(InfTemp$Age[1:4], InfTemp$Temperature[1:4])
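
The pairwise check above can be extended to the whole Age column. The following is a minimal sketch, not part of the original post: it uses a hypothetical toy data frame `temp` that copies the six rows printed by head(InfTemp), so it runs without RPANDA installed. With the package loaded, the same calls should work on InfTemp itself.

```r
# Toy copy of the six rows shown by head(InfTemp) above (an assumption for
# illustration; the full data set has 17632 rows).
temp <- data.frame(
  Age = c(0.000, 0.000, 0.002, 0.002, 0.004, 0.004),
  Temperature = c(3.902176, 2.900296, 4.309984, 5.172534, 3.733446, 4.309984)
)

# Number of rows whose Age repeats an earlier row
n_dup <- sum(duplicated(temp$Age))

# Number of temperature measurements recorded for each distinct Age
per_age <- table(temp$Age)

print(n_dup)
print(per_age)
```

On the toy rows, every age appears twice, which matches the head(InfTemp) output.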

The only reference I found for this temperature estimate is Condamine, F.L., Rolland, J., Morlon, H. (2013) Macroevolutionary perspectives to environmental change, Ecology Letters 16: 72-85, but I wasn't able to figure out why the Age column has duplicates. Could you explain why two Temperature values are assigned to the same Age value?

Many thanks, Putter

FabienCondamine commented 3 years ago

Dear Putter,

Thanks for your email and interest in the RPANDA package.

Yes, the temperature data implemented in RPANDA sometimes has several data points per time unit. This is because the data comes from the Ocean Drilling Program, which has sampled the oceanic crust worldwide with drilling cores that mostly span the Cenozoic. Each drilling core provides data, and the time periods covered by different cores can overlap, so there are multiple data points for the same time. For instance, the 5-Myr time point is sampled multiple times around the world. In other words, the curve incorporates the geographic uncertainty and variation in temperature around the globe.
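
If a single value per time step is needed, for example to compare InfTemp with a lower-resolution reconstruction, one option is to average the measurements that share an age. The sketch below is an illustration only, using a hypothetical toy data frame `temp` built from the six rows shown in the question. Whether averaging across cores is appropriate for a given analysis is an assumption here, not a recommendation from this thread.

```r
# Toy copy of the six rows shown by head(InfTemp) in the question
# (an assumption for illustration, not the full data set).
temp <- data.frame(
  Age = c(0.000, 0.000, 0.002, 0.002, 0.004, 0.004),
  Temperature = c(3.902176, 2.900296, 4.309984, 5.172534, 3.733446, 4.309984)
)

# Collapse repeated ages to the mean temperature across measurements
temp_mean <- aggregate(Temperature ~ Age, data = temp, FUN = mean)

print(temp_mean)
```

The result has one row per distinct age. Other summaries, such as the median, can be swapped in through the FUN argument.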

Among the authors, Julien (Clavel) also knows this type of data very well and can add to my explanation if it is inaccurate.

Anyway, I invite you to read Zachos et al. (2001, Science, https://science.sciencemag.org/content/292/5517/686), Zachos et al. (2008, Nature, https://www.nature.com/articles/nature06588), Cramer et al. (2009, Paleoceanography, https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2008PA001683), and Cramer et al. (2011, J. Geophys. Res., https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2011JC007255) for more information on this data.

I hope it helps. All the best,

Fabien