Closed glaroc closed 7 years ago
thanks @glaroc !
Two issues:
content field in the returned data. Going with your example taxon, here's the first five results from the content and title fields:
> vapply(res$results[1:5], "[[", "", "content")
[1] "Acer saccharum var. floridanum (Chapm.) Small & A. Heller; Acer saccharum var. floridanum Small & A. Heller"
[2] "Acer barbatum Michx.; Acer barbatum; Acer floridanum; Acer saccharum floridanum; Acer floridanum (Chapm.) Pax; Acer saccharum subsp. floridanum; Acer floridanum var. longii Fernald; Acer floridanum Pax"
[3] "Acer barbatum Michx.; Acer floridanum (Chapman) Pax; Saccharodendron barbatum (Michx.) Nieuwl.; Saccharodendron floridanum (Chapman) Nieuwl.; Acer saccharinum var. floridanum Chapman; Acer barbatum var. longii (Fern.) Fern.; Acer barbatum var. villipes (Rehd.) Ashe; Acer floridanum var. longii Fern.; Acer floridanum var. villipes Rehd.; Acer nigrum var. floridanum (Chapman) Fosberg; Acer saccharum var. floridanum (Chapman) Small & Heller; Acer barbatum; Acer saccharum subsp. floridanum (Chapm.) Desmarais; Acer saccharinum var. floridanum Chapm.; Saccharodendron floridanum (Chapm.) Nieuwl.; Acer saccharum subsp. floridanum (Chapman) Desmarais; Acer floridanum var. longii Fernald; Acer barbatum var. longii (Fernald) Fernald; Acer saccharum ssp. floridanum (Chapm.) Desmarais; Acer barbatum var. villipes (Rehder) Ashe; Acer floridanum var. villipes Rehder; Acer nigrum var. floridanum (Chapm.) Fosberg"
[4] "Acer nigrum Michx. f.; Acer nigrum; Acer saccharum nigrum; Acer nigrum F. Michx.; Acer saccharum subsp. nigrum; Acer nigrum F.Michx. (1812); Acer saccharum var. nigrum Britton"
[5] "Saccharodendron nigrum (Michx. f.) Small; Acer saccharum var. viride (Schmidt) E. Murr.; Acer nigrum var. palmeri Sarg.; Acer saccharum var. nigrum (Michx. f.) Britt.; Acer nigrum; Acer saccharum subsp. nigrum (F. Michx.) Desmarais; Acer saccharum subsp. nigrum (Michx. f.) Desmarais; Acer nigrum Michx.; Acer saccharum subsp. nigrum (Michx.) Desmarais; Saccharodendron nigrum (F. Michx.) Small; Acer saccharum ssp. nigrum (F. Michx.) Desmarais; Acer saccharum var. viride (Schmidt) E. Murray; Acer saccharum var. nigrum (F. Michx.) Britton"
> vapply(res$results[1:5], "[[", "", "title")
[1] "Acer floridanum (Chapm.) Pax" "Acer floridanum (Chapm.) Pax" "Acer floridanum (Chapm.) Pax" "Acer nigrum F. Michx." "Acer nigrum F. Michx."
I'm not sure what the content field contains exactly, but it's much more text than the title field.
tryCatch to fix that
I was thinking that the content field is more appropriate because the only entries that contain the actual species name "acer saccharum" in that field are the ones that reference the proper page id (582247). Maybe this doesn't apply to all situations, however.
i'm not sure i follow. can you clarify?
In the json output (http://eol.org/api/search/1.0.json?q=acer+saccharum&page=1&exact=false&filter_by_taxon_concept_id=&filter_by_hierarchy_entry_id=&filter_by_string=&cache_ttl=false), the entries with "id":582247 are the correct ones, and they also contain the correct species name with no attribution in the content field. Doing a search for Acer saccharum on the EOL website also returns page 582247.
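To make the idea concrete, here is a minimal sketch of picking out the rows whose content field holds the bare query name as one of its semicolon-separated entries. It assumes a data frame shaped like the eol_search output discussed in this thread (pageid, name, content). The helper has_exact_name is hypothetical, and the 582247 row is made-up toy data for illustration, not copied from a real API response.

```r
# Toy stand-in for eol_search() output. The first two content strings are
# shortened from this thread; the 582247 row is illustrative only.
res <- data.frame(
  pageid = c(583023L, 596825L, 582247L),
  name = c("Acer floridanum (Chapm.) Pax",
           "Acer nigrum F. Michx.",
           "Acer saccharum Marshall"),
  content = c(
    "Acer saccharum var. floridanum (Chapm.) Small & A. Heller; Acer saccharum var. floridanum Small & A. Heller",
    "Acer nigrum Michx. f.; Acer nigrum; Acer saccharum nigrum",
    "Acer saccharum Marshall; Acer saccharum"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if any semicolon-separated entry in `content`
# equals `query` exactly (ignoring case and surrounding whitespace).
has_exact_name <- function(content, query) {
  entries <- trimws(strsplit(content, ";", fixed = TRUE)[[1]])
  tolower(query) %in% tolower(entries)
}

keep <- vapply(res$content, has_exact_name, logical(1),
               query = "Acer saccharum", USE.NAMES = FALSE)
hits <- res[keep, ]
hits$pageid
```

With this toy data only the 582247 row is kept, because the other rows mention "Acer saccharum" only as part of longer names. Whether real responses always include a bare, authority-free entry is exactly the open question here.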
the sci2comm('Acer saccharum') example should be fixed now, reinstall and try again
Yes, that works!
we should probably just return the output of the content field as well as the link field, so eol_search would do, e.g.:
x <- eol_search('Acer saccharum')
str(x)
#> 'data.frame': 23 obs. of 4 variables:
#> $ pageid : int 583023 583023 583023 596825 596825 583022 583021 583021 1245035 1249734 ...
#> $ name : chr "Acer floridanum (Chapm.) Pax" "Acer floridanum (Chapm.) Pax" "Acer floridanum (Chapm.) Pax" "Acer nigrum F. Michx." ...
#> $ link : chr "http://eol.org/583023?action=overview&controller=taxa" "http://eol.org/583023?action=overview&controller=taxa" "http://eol.org/583023?action=overview&controller=taxa" "http://eol.org/596825?action=overview&controller=taxa" ...
#> $ content: chr "Acer saccharum var. floridanum (Chapm.) Small & A. Heller; Acer saccharum var. floridanum Small & A. Heller" "Acer barbatum Michx.; Acer barbatum; Acer floridanum; Acer saccharum floridanum; Acer floridanum (Chapm.) Pax; "| __truncated__ "Acer barbatum Michx.; Acer floridanum (Chapman) Pax; Saccharodendron barbatum (Michx.) Nieuwl.; Saccharodendron"| __truncated__ "Acer nigrum Michx. f.; Acer nigrum; Acer saccharum nigrum; Acer nigrum F. Michx.; Acer saccharum subsp. nigrum;"| __truncated__ ...
i'm not sure what else can be done though, since the content field is pretty variable in what it contains: sometimes the first name has no authority, sometimes it does, sometimes it matches the entry in title, sometimes it doesn't. And there's no metadata as to what the different semicolon-separated entries within content represent, and I don't see any EOL docs that explain it
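A rough way to see that variability, using three of the title/content pairs printed earlier in this thread (results 1, 2 and 4): split each content string on semicolons and look up where, if anywhere, the title appears among the entries. The variable names below are just for illustration.

```r
# title/content pairs copied from results 1, 2 and 4 shown above
title <- c("Acer floridanum (Chapm.) Pax",
           "Acer floridanum (Chapm.) Pax",
           "Acer nigrum F. Michx.")
content <- c(
  "Acer saccharum var. floridanum (Chapm.) Small & A. Heller; Acer saccharum var. floridanum Small & A. Heller",
  "Acer barbatum Michx.; Acer barbatum; Acer floridanum; Acer saccharum floridanum; Acer floridanum (Chapm.) Pax; Acer saccharum subsp. floridanum; Acer floridanum var. longii Fernald; Acer floridanum Pax",
  "Acer nigrum Michx. f.; Acer nigrum; Acer saccharum nigrum; Acer nigrum F. Michx.; Acer saccharum subsp. nigrum; Acer nigrum F.Michx. (1812); Acer saccharum var. nigrum Britton"
)

# Split each content string into trimmed entries
entries <- lapply(strsplit(content, ";", fixed = TRUE), trimws)

# Position of the title within its own content entries (NA if absent)
title_pos <- mapply(function(t, e) match(t, e), title, entries,
                    USE.NAMES = FALSE)
title_pos
#> [1] NA  5  4
```

So the title is missing from the first record's content and sits at different positions in the other two, which is consistent with there being no fixed meaning for the first entry.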
Well, if the original bug with sci2comm('Acer saccharum') is not related to the use of the title vs content field, I'm not sure anything needs to be done with the content field.
Okay, i might still return the other fields, link and content
For example, a search for sci2comm('Acer saccharum') returns no results. This seems to be due to the fact that eol_search(terms = 'Acer saccharum') does not contain an exact match, since attributions are added to the species name. From my understanding, species listed in the json output of the api call often contain the author in the "title" field, while the plain species names are listed in the content field. In this case, page 582247 is the correct one.