lizzieinvancouver / ospree

Budbreak review paper database

make new speciescomplex for high-quality species #308

Closed lizzieinvancouver closed 4 years ago

lizzieinvancouver commented 5 years ago

@dbuona @cchambe12 Moving part of #304 to here ... I asked ... Do either of you recall (or can find) all our notes on a more stringent cut-off? I recall something like two papers that manipulate two cues OR one paper that manipulates three cues and I could have sworn we wrote this down, but cannot find it. (I only have found this in issue #232 ...)

I have since found this wiki page here which says "Each species/taxa we do use must have 2 or more datasets representing at least 2 of the 3 predictors (chill, force, photo) ... ideal is 2 or more datasets representing all 3 predictors."

I recall a recent discussion this summer where @dbuona reported a more recent version we'd been using which was (maybe?) each species/complex must be in one study if it manipulates all three cues or >1 study if all three cues are manipulated across studies.

I think we maybe did this already (or at least have notes in a wiki page), but I cannot find the code. If we don't have the code I'd like to get something like it ASAP and add it to the end of bb_cleanmergeall.R for now so we can write out a new species list for traits, ranges and phylogeny work.

dbuona commented 5 years ago

@lizzieinvancouver As I am working on making a new species list, I have noticed there are typos in genus/species names that make a single taxon appear to be different species (e.g., Larix decidua and Larix decididua). I was looking back at cleanmergeall.R and was wondering: did we run cleaning/clean_spp_match.R for the added data?
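One way to surface near-duplicate names like this is to compare every pair of names by edit distance. The following is a minimal base R sketch, not the project's cleaning code: the vector `spp` and the threshold of 2 edits are assumptions for illustration, and genuinely distinct species with similar names could be flagged too, so any hits need a manual check.

```r
# Hypothetical vector of taxon names (only the Larix pair comes from the thread).
spp <- c("Larix_decidua", "Larix_decididua", "Picea_abies", "Betula_pendula")

# Pairwise Levenshtein distances between all names (adist is in base R's utils).
dist <- adist(spp)

# Keep each pair once (upper triangle) when the names differ by 1 or 2 edits.
near <- which(dist > 0 & dist <= 2 & upper.tri(dist), arr.ind = TRUE)

# Table of suspicious pairs to review by hand.
pairs <- data.frame(
  a = spp[near[, "row"]],
  b = spp[near[, "col"]],
  edits = dist[near]
)
print(pairs)
```

The threshold is a judgment call: a larger value catches more typos but also more legitimate name pairs.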

lizzieinvancouver commented 5 years ago

@dbuona Thanks for catching that! I did run cleaning/clean_spp_match.R a couple weeks back and the commented out taxon cleaning script (it works off a big list so is slow) ... but obviously it didn't take. Could you run those scripts again and add any cleaning as needed?

dbuona commented 5 years ago

Okay! I fixed a few species names and output a new ospree_clean.csv. The chilling cleaning (chillmerge_all.R) doesn't seem to run on my computer, and I wasn't sure of the status of the chilling calculations anyway. @cchambe12 if it's good to go, would you be able to run it and provide a new ospree_clean_withchill.csv?

dbuona commented 5 years ago

@cchambe12 @AileneKane @MoralesCastilla @DeirdreLoughnan @DarwinSodhi @md-garner @lizzieinvancouver Hi everyone... I wanted to make you aware of our preliminary species list for the expanded ospree data. Please note that not all the source files in bb_cleanmergeall.R have been updated, so this list is subject to change over the next week or two as we all finish our tasks. But in the meantime, you can find the species list in the text file analyses/bb_analysis/rough_sps_list_sept2019

lizzieinvancouver commented 5 years ago

@dbuona @cchambe12 As discussed yesterday, we're not sure we have new code working how we want. We think we want (but could be convinced otherwise) species or complexes where

cchambe12 commented 5 years ago

@lizzieinvancouver @dbuona I will try and tackle this now.

Quick questions:

1) All three cues manipulated in one study - would make a species

2) all three cues across more than one study - would make both species and complexes or are we just looking for species at this point?

3) So essentially we are just including species/complexes that manipulate all three cues regardless of number of studies?

cchambe12 commented 5 years ago

@lizzieinvancouver @dbuona Okay I think this is all set... Gives us 65 species and complexes (55 species, 10 complexes) @dbuona would you mind checking the code? I left notes throughout but if it doesn't make sense then I'll add more notes!

lizzieinvancouver commented 5 years ago

Sorry I was slow on this ... @cchambe12

1. All three cues manipulated in one study - would make a species YES.

2a. all three cues across more than one study - would make both species and complexes or are we just looking for species at this point? SPECIES and COMPLEXES still.

2b. So essentially we are just including species/complexes that manipulate all three cues regardless of number of studies? YES! Good point, unless we can find our old rules and want to revert to those because we think they're better ....

cchambe12 commented 5 years ago

So the species list I came up with is:

[1] "Acer_pensylvanicum" "Acer_pseudoplatanus" "Acer_rubrum" "Acer_saccharum"
[5] "Alnus_glutinosa" "Alnus_incana" "Aronia_melanocarpa" "Betula_alleghaniensis"
[9] "Betula_lenta" "Betula_papyrifera" "Betula_pendula" "Betula_pubescens"
[13] "Cornus_cornuta" "Fagus_grandifolia" "Fagus_sylvatica" "Fraxinus_nigra"
[17] "Hamamelis_complex" "Ilex_mucronata" "Kalmia_angustifolia" "Larix_decidua"
[21] "Lonicera_canadensis" "Picea_abies" "Populus_grandidentata" "Populus_tremula"
[25] "Prunus_padus" "Prunus_pensylvanica" "Pseudotsuga_menziesii" "Quercus_complex"
[29] "Quercus_petraea" "Quercus_robur" "Quercus_rubra" "Rhamnus_complex"
[33] "Rhododendron_complex" "Ribes_nigrum" "Salix_smithiana" "Spirea_alba"
[37] "Tilia_cordata" "Ulmus_minor" "Ulmus_parvifolia" "Ulmus_pumila"
[41] "Ulmus_villosa" "Vaccinium_myrtilloides" "Viburnum_cassinoides" "Viburnum_lantanoides"
[45] "Vitis_vinifera"

But if someone wouldn't mind double checking the code that would be great! And let me know if more notes are needed.

lizzieinvancouver commented 5 years ago

@cchambe12 Seems close! But I noticed this:

dim(d)
bb.wtaxa<-full_join(d, accepties) 
dim(bb.wtaxa) # gaining rows here, which is bad (9 rows I think)

I don't see a reason we should gain rows on this join so I think something is wrong.
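A full join adds a row for every key in the lookup table that has no partner in the data, so listing the unmatched and duplicated keys usually shows where extra rows come from. Below is a minimal sketch with toy data. The column names (`genus`, `species`, `complex`, `resp`) and values are assumptions rather than the real contents of `d` and `accepties`, and base R `merge(..., all = TRUE)` stands in for `full_join`.

```r
# Toy stand-in for `d` (hypothetical columns and values).
d <- data.frame(
  genus = c("Betula", "Betula", "Quercus"),
  species = c("pendula", "nana", "alba"),
  resp = c(10, 12, 15),
  stringsAsFactors = FALSE
)

# Toy stand-in for `accepties`; two rows carry "complex" as the species name,
# so they cannot match any row of `d`.
accepties <- data.frame(
  genus = c("Betula", "Betula", "Quercus"),
  species = c("pendula", "complex", "complex"),
  complex = c("Betula_pendula", "Betula_complex", "Quercus_complex"),
  stringsAsFactors = FALSE
)

# Full outer join on the shared key columns.
joined <- merge(d, accepties, by = c("genus", "species"), all = TRUE)
extra_rows <- nrow(joined) - nrow(d)

# Build a single key per row, then look for the two usual causes of row gain.
key <- function(x) paste(x$genus, x$species, sep = "_")
unmatched <- setdiff(key(accepties), key(d))            # lookup keys absent from d
dup_keys <- key(accepties)[duplicated(key(accepties))]  # keys repeated in lookup

print(extra_rows)
print(unmatched)
print(dup_keys)
```

If both `unmatched` and `dup_keys` are empty, the join should leave the row count of `d` unchanged.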

cchambe12 commented 5 years ago

@lizzieinvancouver phew great catch! I was accidentally changing the species name to 'complex' for the complexes so it wasn't aligning with our data appropriately. It should be all fixed now. The updated species list is below: (unless I should remove crops...?)

[1] "Abies_alba" "Acer_pensylvanicum" "Acer_pseudoplatanus" "Acer_rubrum" "Acer_saccharum"
[6] "Aesculus_hippocastanum" "Alnus_glutinosa" "Alnus_incana" "Alnus_rubra" "Aronia_melanocarpa"
[11] "Betula_alleghaniensis" "Betula_complex" "Betula_lenta" "Betula_papyrifera" "Betula_pendula"
[16] "Betula_pubescens" "Cornus_cornuta" "Corylus_avellana" "Fagus_grandifolia" "Fagus_sylvatica"
[21] "Fraxinus_excelsior" "Fraxinus_nigra" "Hamamelis_complex" "Ilex_mucronata" "Juglans_complex"
[26] "Kalmia_angustifolia" "Larix_decidua" "Lonicera_canadensis" "Malus_domestica" "Picea_abies"
[31] "Picea_glauca" "Pieris_japonica" "Pinus_complex" "Populus_grandidentata" "Populus_tremula"
[36] "Prunus_avium" "Prunus_complex" "Prunus_padus" "Prunus_pensylvanica" "Prunus_persica"
[41] "Pseudotsuga_menziesii" "Pyrus_complex" "Pyrus_pyrifolia" "Quercus_complex" "Quercus_petraea"
[46] "Quercus_robur" "Quercus_rubra" "Rhamnus_complex" "Rhododendron_complex" "Ribes_nigrum"
[51] "Salix_smithiana" "Sorbus_aucuparia" "Sorbus_commixta" "Sorbus_complex" "Spirea_alba"
[56] "Syringa_vulgaris" "Tilia_cordata" "Ulmus_minor" "Ulmus_parvifolia" "Ulmus_pumila"
[61] "Ulmus_villosa" "Vaccinium_myrtilloides" "Viburnum_cassinoides" "Viburnum_lantanoides" "Vitis_vinifera"

cchambe12 commented 5 years ago

@cchambe12 remove crops in code! And add list of 'complex' species

cchambe12 commented 5 years ago

New no crops list:

[1] "Abies_alba" "Acer_pensylvanicum" "Acer_pseudoplatanus"
[4] "Acer_rubrum" "Acer_saccharum" "Aesculus_hippocastanum"
[7] "Alnus_glutinosa" "Alnus_incana" "Alnus_rubra"
[10] "Aronia_melanocarpa" "Betula_alleghaniensis" "Betula_complex"
[13] "Betula_lenta" "Betula_papyrifera" "Betula_pendula"
[16] "Betula_pubescens" "Cornus_cornuta" "Corylus_avellana"
[19] "Fagus_grandifolia" "Fagus_sylvatica" "Fraxinus_excelsior"
[22] "Fraxinus_nigra" "Hamamelis_complex" "Ilex_mucronata"
[25] "Juglans_complex" "Kalmia_angustifolia" "Larix_decidua"
[28] "Lonicera_canadensis" "Picea_abies" "Picea_glauca"
[31] "Pieris_japonica" "Pinus_complex" "Populus_grandidentata"
[34] "Populus_tremula" "Prunus_avium" "Prunus_complex"
[37] "Prunus_padus" "Prunus_pensylvanica" "Prunus_persica"
[40] "Pseudotsuga_menziesii" "Pyrus_complex" "Pyrus_pyrifolia"
[43] "Quercus_complex" "Quercus_petraea" "Quercus_robur"
[46] "Quercus_rubra" "Rhamnus_complex" "Rhododendron_complex"
[49] "Salix_smithiana" "Sorbus_aucuparia" "Sorbus_commixta"
[52] "Sorbus_complex" "Spirea_alba" "Syringa_vulgaris"
[55] "Tilia_cordata" "Ulmus_minor" "Ulmus_parvifolia"
[58] "Ulmus_pumila" "Ulmus_villosa" "Vaccinium_myrtilloides"
[61] "Viburnum_cassinoides" "Viburnum_lantanoides"

cchambe12 commented 5 years ago

And the list of species within complex:

[1] "Betula_ermanii" "Betula_nana" "Betula_occidentalis"
[4] "Betula_pendula,pubescens" "Betula_populifolia" "Hamamelis_japonica"
[7] "Hamamelis_vernalis" "Hamamelis_virginiana" "Juglans_ailantifolia"
[10] "Juglans_cinerea" "Juglans_regia" "Juglans_spp"
[13] "Pinus_banksiana" "Pinus_contorta" "Pinus_nigra"
[16] "Pinus_strobus" "Pinus_sylvestris" "Pinus_taeda"
[19] "Pinus_wallichiana" "Prunus_cerasifera" "Prunus_cerasus"
[22] "Prunus_insititia" "Prunus_salicina" "Prunus_serotina"
[25] "Prunus_serrulata" "Prunus_tenella" "Pyrus_communis"
[28] "Pyrus_communis L." "Pyrus_elaeagnifolia" "Pyrus_ussuriensis"
[31] "Quercus_alba" "Quercus_bicolor" "Quercus_coccifera"
[34] "Quercus_ellipsoidalis" "Quercus_faginea" "Quercus_ilex"
[37] "Quercus_pubescens" "Quercus_shumardii" "Quercus_velutina"
[40] "Rhamnus_alpina" "Rhamnus_cathartica" "Rhamnus_frangula"
[43] "Rhododendron_canadense" "Rhododendron_dauricum" "Rhododendron_mucronulatum"
[46] "Rhododendron_prinophyllum" "Rhododendron_simsii" "Sorbus_aria"
[49] "Sorbus_decora" "Sorbus_intermedia" "Sorbus_torminalis"

lizzieinvancouver commented 4 years ago

I reviewed this again and think it looks good -- I checked the code again (briefly) and spot-checked a few of the species complexes built. I got a different number than Cat but I think it just has to do with the crops, so I think okay.

@cchambe12 Can you note here which data you built to get the species list above, then close this issue?

cchambe12 commented 4 years ago

Species list is built using analyses/bb_analysis/source/speciescomplex.multcues.R

Requirements:

  1. all three cues were manipulated in one study (i.e., two levels) - applies to species
  2. all three cues were manipulated somehow across >1 study (i.e., each study used must have two levels of at least one cue) - applies to species and complexes

And then crops were removed: Actinidia deliciosa, Malus domestica, Vitis vinifera, Ribes nigrum
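The rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of speciescomplex.multcues.R: the table `studies`, its columns, and its values are hypothetical, with one row per taxon and study and a TRUE wherever that study manipulated the cue (at least two levels). It collapses requirements 1 and 2 into "all three cues covered across the taxon's studies", and it writes the crop names with underscores to match the list format used in this thread.

```r
# Hypothetical study-level table: TRUE means the study manipulated that cue.
studies <- data.frame(
  taxon = c("Fagus_sylvatica", "Betula_complex", "Betula_complex",
            "Acer_rubrum", "Vitis_vinifera"),
  datasetID = c("s1", "s2", "s3", "s4", "s5"),
  chill = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  force = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  photo = c(TRUE, FALSE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
cues <- c("chill", "force", "photo")

# Per taxon: was each cue manipulated in at least one of its studies?
any_cue <- aggregate(studies[cues], by = list(taxon = studies$taxon), FUN = any)

# Keep taxa whose studies jointly cover all three cues.
keep <- any_cue$taxon[rowSums(any_cue[cues]) == length(cues)]

# Drop the crops named in the comment above.
crops <- c("Actinidia_deliciosa", "Malus_domestica", "Vitis_vinifera", "Ribes_nigrum")
accepted <- sort(setdiff(keep, crops))
print(accepted)
```

In this toy table, Acer_rubrum drops out because no study manipulates photoperiod, and Vitis_vinifera passes the cue rule but is removed as a crop.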

cchambe12 commented 4 years ago

Blueberries were sneaking into the dataset. I removed: "Vaccinium_ashei", "Vaccinium_corymbosum", "Vaccinium_myrtilloides" from source/speciescomplex.multcues.R

lizzieinvancouver commented 4 years ago

@cchambe12 Vaccinium_myrtilloides is not cultivated, is it? It's what we have in New England! I think it should stay, but I can check what studies it is in if you give me the datasetIDs ....

cchambe12 commented 4 years ago

@lizzieinvancouver Right you're right! Good catch! Okay, I will add Vaccinium_myrtilloides back in. Do I need to worry about the other two Vacciniums for the bbculdesac models?

lizzieinvancouver commented 4 years ago

@cchambe12 @AileneKane Yes, we should probably remove the blueberry crops there too and fix in master....

cchambe12 commented 4 years ago

Should be all set!!

cchambe12 commented 4 years ago

@lizzieinvancouver If we are now using speciescomplex.multcues.R, does that mean we need to change our flagging in bbstanleadin.R to use.multcuespp==TRUE? Should I just rename speciescomplex.multcues.R to simply speciescomplex.R to better match bbculdesac?

cchambe12 commented 4 years ago

@AileneKane in bbculdesac did we ever use the speciescomplex.multcues.R source code which would make the flag use.multcuespp==TRUE?

cchambe12 commented 4 years ago

Updated bb_analysis/source/bbstanleadin.R to include speciescomplex.multcues.R