Closed bselden closed 8 years ago
Has this been corrected subsequently? OK, I looked into it and it has not been corrected.
FYI, here's how I fixed it:
# quoted expressions to hold subsetting logic
exp1 <- quote(!is.na(spp) & flag=="BS-batch" & (taxLvl%in%c("species", "genus", "subspecies")))
exp2 <- quote(sapply(strsplit(spp, " "), '[', 1)) # will be used twice
spp.key[eval(exp1) & (is.na(genus) | genus!=eval(exp2))] # show cases
spp.key[eval(exp1) & (is.na(genus) | genus!=eval(exp2)), genus:=eval(exp2)] # fix
spp.key[eval(exp1) & (is.na(genus) | genus!=eval(exp2))] # should show nothing
I found 41 instances. I think this takes care of the problem; let me know otherwise. I will commit this fix on development and open a pull request. That branch is accumulating fixes and features like crazy.
Hi Ryan, that's weird. I thought I did this in my last edits to the spp.key file. Perhaps I missed some, but 41 seems like a lot. Is there an easy way to send me the 41 instances, so I can spot-check that this code is doing what we want? (It looks like it should.) Becca
Becca Selden PhD Student in Ecology, Evolution, and Marine Biology University of California, Santa Barbara selden@lifesci.ucsb.edu 320-339-0169 selden@lifesci.ucsb.edu
I'll have to run the code again; I didn't save the changes.
If I do library("trawlData"), then use exp1 and exp2 in the above code, I get the following:
> spp.key[eval(exp1) & (is.na(genus) | genus!=eval(exp2)), list(ref, spp, common, taxLvl, species, genus)] # show cases
ref spp common taxLvl species genus
1: BARBATIA DOMIGESIS Acar domingensis NA species Acar domingensis Barbatia
2: GRIMATROCTES Bathytroctes NA genus NA Grimatroctes
3: GRIMATROCTES BULLISI Bathytroctes microlepis NA species Bathytroctes microlepis Grimatroctes
4: PARADIPLOGRAMMUS BAIRDI Callionymus bairdi NA species Callionymus bairdi Paradiplogrammus
5: CALLISTA EUCYMATA Callpita eucymata NA species Callpita eucymata Callista
6: DACTYLOMETRA QUIQUECIRRHA Chrysaora quinquecirrha NA species Chrysaora quinquecirrha Dactylometra
7: BAIRDIELLA BATABAA Corvula batabana NA species Corvula batabana Bairdiella
8: NEMATONURUS ARMATUS Coryphaenoides armatus NA species Coryphaenoides armatus Nematonurus
9: CORYTHOICHTHYS ALBIROSTRIS Cosmocampus albirostris NA species Cosmocampus albirostris Corythoichthys
10: OSTREA PERMOLLIS Cryptostrea permollis NA species Cryptostrea permollis Ostrea
11: PSEUDOCYPHOMA ITERMEDIUM Cyphoma intermedium NA species Cyphoma intermedium Pseudocyphoma
12: RAJA CLARKII Dactylobatus clarkii NA species Dactylobatus clarkii Raja
13: RAJA OREGOI Dipturus oregoni NA species Dipturus oregoni Raja
14: GOBIOSOMA XATHIPRORA Elacatinus xanthiprora NA species Elacatinus xanthiprora Gobiosoma
15: PODOCHELA GRACILIPES Ericerodes gracilipes NA species Ericerodes gracilipes Podochela
16: CYPRAEA SPURCA Erosaria spurca NA species Erosaria spurca Cypraea
17: OOCORYS BARTSCHI Eucorys bartschi NA species Eucorys bartschi Oocorys
18: MUREX CELLULOSUS Favartia cellulosa NA species Favartia cellulosa Murex
19: HYPSELODORIS EDETICULATA Felimare picta NA species Felimare picta Hypselodoris
20: FUSIUS EUCOSMIUS Fusinus excavatus NA species Fusinus excavatus Fusus
21: LATIRUS CARIIFERUS Hemipolygona carinifera NA species Hemipolygona carinifera Latirus
22: LATIRUS MCGITYI Hemipolygona mcgintyi NA species Hemipolygona mcgintyi Latirus
23: TEREBRA SALLEAA Impages salleana NA species Impages salleana Terebra
24: OPLOPHORUS SPIICAUDA Janicella spinicauda NA species Janicella spinicauda Oplophorus
25: TETRAPTURUS ALBIDUS Kajikia albida NA species Kajikia albida Tetrapturus
26: RAJA LETIGIOSA Leucoraja lentiginosa NA species Leucoraja lentiginosa Raja
27: LIMA PELLUCIDA Limaria pellucida NA species Limaria pellucida Lima
28: SCALPELLUM GIGATEUM Litoscalpellum giganteum NA species Litoscalpellum giganteum Scalpellum
29: LYREIDUS BAIRDII Lysirude nitidus NA species Lysirude nitidus Lyreidus
30: CYPRAEA CERVUS Macrocypraea cervus NA species Macrocypraea cervus Cypraea
31: TURBO CASTAEUS Manzonia crassa NA species Manzonia crassa Turbo
32: MACROCALLISTA MACULATA Megapitaria maculata NA species Megapitaria maculata Macrocallista
33: BATHYLAGUS BERICOIDES Melanolagus bericoides NA species Melanolagus bericoides Bathylagus
34: CYMATIUM KREBSII Monoplex krebsii NA species Monoplex krebsii Cymatium
35: MITHRAX ACUTICORIS Nemausa acuticornis NA species Nemausa acuticornis Mithrax
36: ACTAEA RUFOPUCTATA Paractaea rufopunctata NA species Paractaea rufopunctata Actaea
37: IOGLOSSUS CALLIURUS Ptereleotris calliura NA species Ptereleotris calliura Ioglossus
38: TRIVIA PEDICULUS Pusula pediculus NA species Pusula pediculus Trivia
39: UROLOPHUS JAMAICECIS Urobatis jamaicensis NA species Urobatis jamaicensis Urolophus
40: MUREX CABRITTI Vokesimurex cabritii NA species Vokesimurex cabritii Murex
41: YARELLA BLACKFORDI Yarrella blackfordi NA species Yarrella blackfordi Yarella
ref spp common taxLvl species genus
Yep, those were all ones that I changed in my last update to the taxonomy file. But I was essentially doing what your code does, so that's fine. What's more concerning is that in that last update I also changed several of the BS non-batch entries, so that entries with the same spp also have the rest of their info the same (common name, taxonomy, etc.) if that species already existed in the list. Those changes won't be as easy to reproduce with a function.
The version of spp.key that's in my trawlData repo on my computer at home, attached below, does include the changes to the BS-batch genus (and all the other changes I made), dated 11/29/15. I pushed these changes (I thought anyway) with ecc8ab7 https://github.com/rBatt/trawlData/commit/ecc8ab7b0c1720eae3de42deeeb8641a2264b97d
When I push with git, does that not automatically update the files that are in the trawlData package?
Becca
Well, pushing to the repo updates the .csv file on the repo. But that's different from the package. The package references a .RData file in another folder. I wrote a small function to read in the csv, convert characters to ASCII, and save the new .RData file for the package.
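For anyone following along, the csv-to-.RData step described above might look roughly like the sketch below. This is not the actual function from the package: the name rebuild_spp_key, its arguments, the assumption that the csv is UTF-8, and the choice to drop characters that cannot be converted (sub = "") are all made up for illustration.

```r
# Hypothetical helper (not the package's real function): rebuild the .RData
# copy of spp.key from the csv that lives in the repo.
rebuild_spp_key <- function(csv_path, rdata_path) {
  # Read the csv, keeping text columns as character (assumes UTF-8 input)
  spp.key <- read.csv(csv_path, stringsAsFactors = FALSE, encoding = "UTF-8")

  # Convert every character column to ASCII; characters that cannot be
  # represented are dropped here, which is an assumption of this sketch
  is_chr <- vapply(spp.key, is.character, logical(1))
  spp.key[is_chr] <- lapply(spp.key[is_chr], iconv,
                            from = "UTF-8", to = "ASCII", sub = "")

  # Save under the object name the package data is expected to use
  save(spp.key, file = rdata_path)
  invisible(spp.key)
}
```

Under those assumptions it would be called with the two paths mentioned in this thread, e.g. rebuild_spp_key("inst/extdata/taxonomy/spp.key.csv", "data/spp.key.RData").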
I haven't pushed my current version of the repo yet.
I'm going to do
git checkout ecc8ab7b0c1720eae3de42deeeb8641a2264b97d -- inst/extdata/taxonomy/spp.key.csv
That command will pull in your version of the file from Git history. Then I can update the spp.key and check for those genera again.
We'll figure it out. It's possible that I did something dumb last night, I was pretty tired by the end of it :stuck_out_tongue_winking_eye: The great part about Git though is that we don't have to worry about losing anything. It's all saved.
More updates soon.
OK, I just did that, and this is the output:
> spp.key[eval(exp1) & (is.na(genus) | genus!=eval(exp2)), list(ref, spp, common, taxLvl, species, genus)] # show cases
ref spp common taxLvl species genus
1: CYMATIUM KREBSII Monoplex krebsii NA species Monoplex krebsii Cymatium
So it's saying you only missed 1. Which sounds reasonable?
I'll go ahead and make this fix directly on master, merge the changes into development, then pull development back into master.
After that, everything should be the same everywhere, and everything should be up to date.
I had merged the master versions of the .RData and .csv spp.key files into the development branch in a "take theirs" fashion using git checkout master --theirs -- inst/extdata/taxonomy/spp.key.csv and git checkout master --theirs -- data/spp.key.RData.
I am also running library(remake); make() and library(devtools); document(); check(); unload(); install() to get a clean install and update of the package.
OK, this has been taken care of. All up to date.
Sorry it took a while; I walked away from my computer while it reinstalled!
I'll pair the closure of this message with a release.
@rbatt
For those species with an NA conflict field, and a BS-batch flag (from the batch download I did from WORMS), the spp will show the accepted name. But if this is different from the species it matched in ref, the genus will still be the old genus.
Example:
ref = BARBATIA DOMIGESIS
species that was matched in the database (does not appear in file) = Barbatia domingensis
spp = accepted name = Acar domingensis
species = Acar domingensis
genus = Barbatia
See http://www.marinespecies.org/aphia.php?p=taxdetails&id=582484
Will need to subset the data by the BS-batch flag, create a temporary genus column that is a split of spp, then run something along the lines of ifelse(genus.temp == genus, genus, genus.temp)
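A minimal base-R sketch of that idea, on a toy stand-in for spp.key. The third row (GADUS MORHUA), the "other" flag value, and the NA genus in the second row are invented for illustration and are not from the real file. Unlike a bare ifelse(genus.temp == genus, ...), it also treats a missing genus as needing the fix, since comparing against NA returns NA.

```r
# Toy stand-in for spp.key; the real table has more columns and rows
spp_key <- data.frame(
  ref    = c("BARBATIA DOMIGESIS", "RAJA CLARKII", "GADUS MORHUA"),
  spp    = c("Acar domingensis", "Dactylobatus clarkii", "Gadus morhua"),
  flag   = c("BS-batch", "BS-batch", "other"),
  taxLvl = c("species", "species", "species"),
  genus  = c("Barbatia", NA, "Gadus"),
  stringsAsFactors = FALSE
)

# Rows eligible for the fix: BS-batch entries with an accepted name in spp
is_batch <- !is.na(spp_key$spp) & spp_key$flag == "BS-batch" &
  spp_key$taxLvl %in% c("species", "genus", "subspecies")

# Temporary genus: the first word of the accepted name
genus_temp <- sapply(strsplit(spp_key$spp, " "), `[`, 1)

# Overwrite genus where it is missing or disagrees with the accepted name
needs_fix <- is_batch & (is.na(spp_key$genus) | spp_key$genus != genus_temp)
spp_key$genus[needs_fix] <- genus_temp[needs_fix]
```

Rows outside the BS-batch subset are left untouched, so the same lines can be rerun safely; after one pass needs_fix should come out all FALSE.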