KarinaGuo / Machine_Learning_in_Computer_Vision_for_Leaf_Traits

Datasets and scripts for a journal paper titled 'Using Machine Learning to Link Climate, Phylogeny and Leaf Area in Eucalypts Through a 50-fold Expansion of Current Leaf Trait Datasets'.
MIT License

how to handle austraits data #8

Open wcornwell opened 1 year ago

wcornwell commented 1 year ago

see issue: https://github.com/traitecoevo/austraits.build/issues/660

wcornwell commented 1 year ago

I think something like this:

library(austraits)
library(dplyr)  # for %>% and filter()
austraits <- load_austraits(version = "3.0.2", path = "data/austraits")

la<-extract_trait(austraits,trait_names = "leaf_area")
Cor <- extract_taxa(la, genus = "Corymbia")
cor_field<-(Cor %>% join_all)$traits %>%
  dplyr::filter(collection_type=="field")

should just get us the field data. still working on figuring out how to filter out the juvenile leaves

wcornwell commented 1 year ago

ok been chatting with lizzy, I think this is better

library(austraits) 
library(tidyverse)
austraits <- load_austraits(version = "4.0.0", path = "data/austraits")

left_join(austraits$traits,austraits$taxa)%>%
  left_join(austraits$locations) %>%
  filter(genus %in% c("Eucalyptus","Angophora","Corymbia")) %>%
  filter(trait_name == "leaf_area") %>%
  filter(life_stage=="adult" & basis_of_record %in% c("field","literature","literature, field")) -> euc_subset

wcornwell commented 1 year ago

can discuss whether to exclude juvenile leaves... but we should probably update to version 4

KarinaGuo commented 1 year ago

Thanks Will! New code is as below

library(austraits)
library(tidyverse)  # dplyr verbs, %>% and ggplot2 are used below
library(sp)         # SpatialPoints()
#austraits <- load_austraits(version = "4.0.0", path = "data/austraits")
austraits <- readRDS("C:/Users/swirl/Downloads/austraits-4.0.0.rds")

AusTraits_loc <- austraits$locations %>% 
  filter (location_property %in% c("latitude (deg)", "longitude (deg)")) %>%
  filter (!is.na(value)) %>% 
  filter (value != "" & value != "unknown" & value != "NA") %>% 
  mutate (num_value = as.numeric(value))
AusTraits_loc <- reshape2::dcast(AusTraits_loc, formula = dataset_id + location_id + location_name ~ location_property, value.var = 'num_value')

EucalyptsArea <- left_join(AusTraits_loc, austraits$traits) %>%
  left_join (austraits$taxa) %>%
  filter (genus %in% c("Eucalyptus","Angophora","Corymbia")) %>%
  filter (trait_name == "leaf_area") %>%
  filter (life_stage=="adult" & basis_of_record %in% c("field","literature, field")) %>% 
  mutate (mask_area_results = as.numeric(value)/100) %>% 
  dplyr::select (dataset_id, location_id, "longitude (deg)", "latitude (deg)", taxon_name, mask_area_results, life_stage, basis_of_record)
colnames(EucalyptsArea)[3] <- "decimalLongitude"
colnames(EucalyptsArea)[4] <- "decimalLatitude"

coords <- data.frame(x=(EucalyptsArea[["decimalLongitude"]]),y=(EucalyptsArea[["decimalLatitude"]]))
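# `r` is the climate raster (with Temp and Prec layers) loaded elsewhere; it is not defined in this snippet
points <- SpatialPoints(coords, proj4string = r@crs)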

values <- raster::extract(r,points)
EucalyptsArea <- cbind.data.frame(values, EucalyptsArea) %>% 
  dplyr::filter (!is.na(Temp))
EucalyptsArea[1] <- EucalyptsArea[1]/10
EucalyptsArea[3] <- EucalyptsArea[3]/10
EucalyptsArea[4] <- EucalyptsArea[4]/10
EucalyptsArea[5] <- EucalyptsArea[5]/10
EucalyptsArea[6] <- EucalyptsArea[6]/10

EucalyptsArea <- EucalyptsArea %>% 
  mutate(log_MAP = log10(Prec))

p1 <- ggplot (data = EucalyptsArea, mapping = aes (x=Prec, y=mask_area_results)) +
  geom_jitter(alpha = 0.3, width = 0.1, height = 0.1, size = 0.6) +
  labs(y="Leaf area (cm2)", x = "Mean annual precipitation (mm)") +
  theme_bw() +
  theme (legend.position="none") +
  scale_y_continuous(trans='log10')  +
  scale_x_continuous(trans='log10') + 
  geom_abline(slope = 0.043, intercept = 0.207, linetype="dashed", 
              color = "red", linewidth=0.5)
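
The location reshaping and the positional column renames above depend on column order. Below is a minimal base-R sketch of the same idea using column names instead. It runs on made-up rows that mimic the long `locations` layout; the toy values and the names `loc_long`, `lat`, `lon` and `loc_wide` are illustrative assumptions, not AusTraits data.

```r
# Toy long-format locations table (hypothetical values, not AusTraits data)
loc_long <- data.frame(
  dataset_id        = c("A", "A", "A", "B", "B"),
  location_id       = c("01", "01", "01", "01", "01"),
  location_property = c("latitude (deg)", "longitude (deg)", "description",
                        "latitude (deg)", "longitude (deg)"),
  value             = c("-33.9", "151.2", "dry sclerophyll forest",
                        "unknown", "145.0"),
  stringsAsFactors  = FALSE
)

# Keep only coordinate rows and drop values that cannot be parsed as numbers
coords <- loc_long[loc_long$location_property %in%
                     c("latitude (deg)", "longitude (deg)"), ]
coords$num_value <- suppressWarnings(as.numeric(coords$value))
coords <- coords[!is.na(coords$num_value), ]

# Split into latitude and longitude, naming the value column explicitly
id_cols <- c("dataset_id", "location_id")
lat <- coords[coords$location_property == "latitude (deg)", c(id_cols, "num_value")]
lon <- coords[coords$location_property == "longitude (deg)", c(id_cols, "num_value")]
names(lat)[names(lat) == "num_value"] <- "decimalLatitude"
names(lon)[names(lon) == "num_value"] <- "decimalLongitude"

# Inner join: locations missing either coordinate are dropped here
loc_wide <- merge(lat, lon, by = id_cols)
print(loc_wide)
```

Unlike the `dcast` call, the inner join drops locations with only one usable coordinate up front, which may or may not be wanted depending on the later raster extraction step.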
wcornwell commented 1 year ago

seems done!

wcornwell commented 1 year ago

might have to think about whether to do something about this: about half include the petiole and half don't

[Image: Screenshot 2023-02-01 at 3 11 53 pm]

KarinaGuo commented 1 year ago

Interesting, I guess that's the problem with unstandardised methods of measurement. Petioles can add another 5cm+ onto the measurement area... Just another thing to mention in the discussion?
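
If it turns out to be worth quantifying, one rough way to tally the petiole split would be to classify each dataset's free-text methods description. This is only a sketch on made-up text: the `toy_methods` table, the `classify_petiole()` helper and the regular expressions are all hypothetical, and whether the real methods text is worded consistently enough for this to work is an assumption that would need checking by eye.

```r
# Hypothetical methods descriptions (not taken from AusTraits)
toy_methods <- data.frame(
  dataset_id = c("A", "B", "C"),
  methods    = c("Leaf area measured with a flatbed scanner, including petiole.",
                 "Lamina area only, excluding the petiole.",
                 "Leaf area from herbarium specimens."),
  stringsAsFactors = FALSE
)

# Label each description as petiole included / excluded / not stated.
# Exclusion wording is checked first so it takes priority if both match.
classify_petiole <- function(txt) {
  excl <- grepl("(without|excluding|excl\\.?|minus)\\s+(the\\s+)?petiole",
                txt, ignore.case = TRUE)
  incl <- grepl("(with|including|incl\\.?|plus)\\s+(the\\s+)?petiole",
                txt, ignore.case = TRUE)
  ifelse(excl, "excluded", ifelse(incl, "included", "not stated"))
}

toy_methods$petiole <- classify_petiole(toy_methods$methods)
print(table(toy_methods$petiole))
```

The resulting label could then be joined back onto the leaf area rows by `dataset_id` to see whether the two groups differ, which would help decide if this needs more than a mention in the discussion.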