danielabar / globi-proto

InfoVis 2015 IVMOOC Globi Explorer
http://danielabar.github.io/globi-proto
MIT License

Humans (Homo sapiens) eat salmon (Salmonidae), but Salmons are not eaten by humans? #1

Closed jhpoelen closed 9 years ago

jhpoelen commented 9 years ago

I think the navigation tool looks very nice already: I was able to browse around the interaction webs quite easily.

I did find something strange. After selecting "eats" and Homo sapiens, I found that humans eat salmon (Salmonidae). After clicking on salmon and selecting "eatenBy", I found that salmon was eaten by bald eagles only. I would expect humans to be in the list as well.

The underlying GloBI (see http://globalbioticinteractions.org) API calls, like http://api.globalbioticinteractions.org/taxon/Homo%20sapiens/eats or http://api.globalbioticinteractions.org/interaction?sourceTaxon=Homo%20sapiens&interactionType=eats

and

http://api.globalbioticinteractions.org/taxon/Salmonidae/eatenBy or http://api.globalbioticinteractions.org/interaction?sourceTaxon=Salmonidae&interactionType=eatenBy

did include humans among the consumers of salmon.

@danielabar Could you share how you are building up the api calls? Perhaps I can help solve the problem.

danielabar commented 9 years ago

The reason for this result seems to be that the tabular JSON format returns differently structured responses, and the code handles just one of them. For example:

What do humans eat? The code constructs this url: http://api.globalbioticinteractions.org/taxon/Homo%20sapiens/eats

Given this format of API response, the actual list of species that humans eat is parsed with the following code: var speciesList = response.data[0][2];

And many of the API responses seem to work like this.

However, for the other case, "What eats salmon?", this URL is constructed: http://api.globalbioticinteractions.org/taxon/Salmonidae/eatenBy

Notice that this time the response is formatted differently, so the parsing that worked for "what do humans eat" will not work for "what does salmon get eaten by".

Would it be possible for the API to return a consistent response for all the interaction types? Or is there a different query structure I should use to get consistent responses?

danielabar commented 9 years ago

I did a little more investigation. If I append type=json.v2 to the queries, the responses appear to be more consistent, which seems to solve the problem. As a bonus, it makes the parsing code a little easier to read compared to the tabular format.

I know we had discussed earlier that the json.v2 format is a little more "chatty", but that doesn't seem to be a problem for an app of this relatively small scale.

One interesting thing, though: I'm now noticing duplicates in the species target list. I can solve this by filtering dupes on the client side.

jhpoelen commented 9 years ago

You mention that http://api.globalbioticinteractions.org/taxon/Homo%20sapiens/eats and http://api.globalbioticinteractions.org/taxon/Salmonidae/eatenBy produce a different result. I think I might know what the confusion is about.

The first URL points to a specific species, Homo sapiens, so the resulting interactions include a single data row with a list of food items:

{
  "columns" : [ "source_taxon_name", "interaction_type", "target_taxon_name" ],
  "data" : [ [ "Homo sapiens", "eats", [ "Kinixys erosa", "Cricetomys emini", "Atherurus africanus", "Kinixys", "Cephalophus monticola", "Cercopithecus pogonias", "Cercopithecus nictitans", "Cercopithecus mona", "Civettictis civetta", "Nandinia binotata", "Manis tricuspis", "Eidolon helvum", "Bos taurus", "Bos grunniens", "Bos sauveli", "Salmonidae", "Rhinolophus inops", "Chaetophractus nationi", "Decapoda", "Morone saxatilis", "Pomatomus saltatrix", "Scombridae", "demersal species", "Rupicapra rupicapra", "Cyclura cornuta", "Ovis ammon", "Pleuronectiformes", "Phalanger lullulae", "Odocoileus virginianus", "Sus verrucosus", "Alces alces", "Osphronemidae", "Cervus nippon", "Redunca arundinum", "Redunca fulvorufula", "Bivalvia", "Saguinus bicolor", "Monachus tropicalis", "Hippotragus niger", "Alligator mississippiensis", "Cervus unicolor", "Cervus alfredi", "Neophoca cinerea", "Cervus eldii", "Cervus albirostris", "Arctocephalus australis", "Lepus nigricollis", "Vulpes vulpes", "Sus celebensis", "Fundulus heteroclitus", "Allocebus trichotis", "Manis javanica", "Loxodonta africana", "Genetta piscivora", "Petrodromus tetradactylus", "Avahi laniger", "Paguma larvata", "Cyprinus carpio", "Scaridae", "Eulemur rubriventer", "Ursus americanus", "Ursus arctos", "Ursus maritimus", "Eubalaena glacialis", "Enhydra lutris", "Capra ibex", "Actinopterygii", "Phoca", "Gadidae", "Melanogrammus aeglefinus", "Sylvilagus floridanus", "Astarte arctica", "Pollachius pollachius", "Peprilus triacanthus", "Venustaconcha ellipsiformis", "Balaena mysticetus", "Cichlidae", "Alosa pseudoharengus", "Scomber", "Clupea harengus", "Ostreoida", "Pygathrix nemaeus", "Hippoglossus hippoglossus", "Mustelus canis", "Pseudopleuronectes americanus", "Hippoglossoides platessoides", "Monodon monoceros", "Glyptocephalus cynoglossus", "Hippoglossina oblonga", "Equus kiang", "Scophthalmus aquosus", "Paralichthys dentatus", "Limanda ferruginea", "Aythya americana", "Salmo", "Delphinapterus leucas", "Sander vitreus", "Scomber japonicus", "Todarodes pacificus", "Stenella", "Engraulis japonicus", "Trachurus japonicus", "Physeter macrocephalus", "Mirounga leonina", "Plecotus austriacus", "Oreochromis niloticus", "Lepisosteus platostomus", "Galago alleni", "Siganidae", "Acanthuridae", "Castor canadensis", "Canis lupus dingo", "Hippopotamidae", "Barbus", "Tilapia zillii", "Artocarpus altilis", "Hypsignathus monstrosus", "Phasianidae", "Cyrtosperma", "Pandanus", "Cheloniidae", "Sus scrofa", "Arecaceae", "Birgus latro", "Colobus angolensis", "Macaca sylvanus", "Gazella gazella", "Chondrichthyes", "Istiophoridae", "Phocidae", "Cynoscion", "Lophius americanus", "Squalus acanthias", "Ammodorcas clarkei", "Procolobus rufomitratus", "Procolobus verus", "Beatragus hunteri", "Colobus satanas", "Bubalus quarlesi", "Arapaima gigas", "Raphicerus melanotis", "Raphicerus sharpei", "Lophocebus albigena", "Cyrtonyx montezumae", "Patellogastropoda", "Echinoidea", "Ailuropoda melanoleuca", "Crateromys schadenbergi", "Mazama gouazoupira", "Cebus olivaceus", "Branta canadensis", "Didelphis virginiana", "Leporidae", "Bonasa umbellus", "Saguinus nigricollis", "Mydaus marchei", "Mustelinae", "Anatidae", "Conepatus chinga", "Brachyteles arachnoides", "Crocodylia", "Aves", "Lontra provocax", "Moschus chrysogaster", "Pryola", "Vulpes cana", "Myotis myotis", "Mysticeti", "Phoebastria nigripes", "Muntiacus vuquangensis", "Bassaricyon gabbii", "Leopardus geoffroyi", "Scorpaenichthys marmoratus", "Rajiformes", "Helianthus", "Sterna", "Bos javanicus", "Martes melampus", "Pteropus samoensis", "Sergia lucens", "Anas strepera", "Anas acuta", "Dendrolagus scottae", "Oreochromis leucostictus", "Macropus bernardus", "Octopoda" ] ] ]
}

However, for the question of who eats salmon (Salmonidae), various taxa that are part of the salmon family are selected. The result therefore includes multiple rows, each listing the consumers of one taxon in the salmon family:

{
  "columns" : [ "source_taxon_name", "interaction_type", "target_taxon_name" ],
  "data" : [ [ "Salvelinus namaycush", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Salvelinus", "eatenBy", [ "Salvelinus namaycush", "Haliaeetus leucocephalus" ] ], [ "Prosopium cylindraceum", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Coregonus", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Oncorhynchus mykiss", "eatenBy", [ "Pandion haliaetus", "Salvelinus fontinalis", "Salmo trutta", "Oncorhynchus aguabonita", "Oncorhynchus mykiss" ] ], [ "Oncorhynchus tshawytscha", "eatenBy", [ "Laridae", "Orcinus orca", "Otariidae", "Phalacrocoracidae", "Morone saxatilis", "Ursinae", "Alosa sapidissima", "Lutrinae", "Cottoidei", "Accipitridae", "Phocidae", "Sebastes melanops" ] ], [ "Oncorhynchus clarkii", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Coregonus albula", "eatenBy", [ "Sander vitreus" ] ], [ "Coregonus hoyi", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Prosopium williamsoni", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Salmonidae", "eatenBy", [ "Homo sapiens", "Esocidae", "Haliaeetus leucocephalus" ] ], [ "Salmo trutta", "eatenBy", [ "Ardea cinerea", "Lutra lutra", "Phalacrocorax carbo", "Larus argentatus", "Salvelinus fontinalis", "Salmo trutta", "Oncorhynchus aguabonita", "Oncorhynchus mykiss" ] ], [ "Salmo salar", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Salmo trutta fario", "eatenBy", [ "Natrix tessellata" ] ], [ "Oncorhynchus aguabonita", "eatenBy", [ "Salvelinus fontinalis", "Salmo trutta", "Oncorhynchus aguabonita", "Oncorhynchus mykiss" ] ], [ "Salmo", "eatenBy", [ "Homo sapiens", "Mustela" ] ], [ "Salvelinus fontinalis", "eatenBy", [ "Salvelinus fontinalis", "Salmo trutta", "Oncorhynchus aguabonita", "Oncorhynchus mykiss" ] ], [ "Salvelinus confluentus", "eatenBy", [ "Aves", "Salvelinus namaycush" ] ], [ "Oncorhynchus nerka", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Salvelinus malma", "eatenBy", [ "Haliaeetus leucocephalus" ] ], [ "Oncorhynchus gorbuscha", "eatenBy", [ "Haliaeetus leucocephalus" ] ] ]
}

When using the code var speciesList = response.data[0][2], this selects only the first data row, [ "Salvelinus namaycush", "eatenBy", [ "Haliaeetus leucocephalus" ] ], which describes that an American lake char (Salvelinus namaycush) is eaten by a bald eagle (Haliaeetus leucocephalus). So, in order to get all the results, you'd have to include all data rows, not just the first one.
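A minimal sketch of reading every data row rather than only the first. The helper name collectTargets is made up for illustration, and the sample is a trimmed copy of the Salmonidae/eatenBy response shown earlier; it assumes each row keeps the [ source, interaction, [ targets ] ] layout.

```javascript
// Sketch: gather target taxa from every row of a tabular response,
// instead of reading only response.data[0][2].
// collectTargets is a hypothetical helper, not part of the app or the API.
function collectTargets(response) {
  var targets = [];
  (response.data || []).forEach(function (row) {
    // Assumed row layout: [source_taxon_name, interaction_type, [target_taxon_name, ...]]
    var rowTargets = row[2] || [];
    rowTargets.forEach(function (name) {
      targets.push(name);
    });
  });
  return targets;
}

// Trimmed sample based on the Salmonidae/eatenBy response.
var sample = {
  columns: ['source_taxon_name', 'interaction_type', 'target_taxon_name'],
  data: [
    ['Salvelinus namaycush', 'eatenBy', ['Haliaeetus leucocephalus']],
    ['Salmonidae', 'eatenBy', ['Homo sapiens', 'Esocidae', 'Haliaeetus leucocephalus']],
    ['Salmo', 'eatenBy', ['Homo sapiens', 'Mustela']]
  ]
};

console.log(collectTargets(sample));
```

With this sample the combined list contains Homo sapiens, and also repeats some names, since the same consumer can appear in more than one row.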

Does this make sense?

jhpoelen commented 9 years ago

PS This also explains the target taxon duplicates: if two kinds of salmon eat (or are eaten by) the same thing, then that same thing would occur twice in the results.
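Filtering those duplicates on the client could look roughly like this. It is only a sketch; uniqueNames is a hypothetical helper name, and it assumes the targets have already been collected into a flat array of name strings.

```javascript
// Sketch: remove duplicate taxon names while keeping first-seen order.
// uniqueNames is a hypothetical helper for illustration.
function uniqueNames(names) {
  var seen = {};
  return names.filter(function (name) {
    if (Object.prototype.hasOwnProperty.call(seen, name)) {
      return false; // already emitted once
    }
    seen[name] = true;
    return true;
  });
}

console.log(uniqueNames([
  'Haliaeetus leucocephalus',
  'Homo sapiens',
  'Haliaeetus leucocephalus'
]));
```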

danielabar commented 9 years ago

The problem is that I don't know where any given taxon is on the chain, so for any query in the format /taxon/:taxon/:interaction I need a consistent response to parse, because the same code will handle it. Using json.v2 seems to make the response more consistent, and the dupes can be filtered client side.

But the bigger issue may be how to know where in the taxon chain any given result is. Is this data exposed in any of the API calls?

jhpoelen commented 9 years ago

This data is exposed using the source_taxon_path and target_taxon_path fields. Here is the list of all available fields: http://api.globalbioticinteractions.org/interactionFields . You can select fields using one or more "field" query parameters.

Example: http://api.globalbioticinteractions.org/interaction?sourceTaxon=Salmonidae&interactionType=eatenBy&field=source_taxon_path&field=source_taxon_path_ids&field=source_taxon_path_ranks&field=target_taxon_path&field=target_taxon_path_ids&field=target_path_ranks

Would it help if I also made this work for the taxon/[source taxon]/[interactionType] URL syntax?

Something like: http://api.globalbioticinteractions.org/taxon/Salmonidae/eatenBy?field=source_taxon_path&field=source_taxon_path_ids&field=source_taxon_path_ranks&field=target_taxon_path&field=target_taxon_path_ids&field=target_path_ranks

Thanks for being patient!

jhpoelen commented 9 years ago

PS Perhaps easier would be to use http://api.globalbioticinteractions.org/findTaxon/Homo%20sapiens . This returns the taxon path by default.

{"path":"Animalia | Bilateria | Deuterostomia | Chordata | Vertebrata | Gnathostomata | Tetrapoda | Mammalia | Theria | Eutheria | Primates | Haplorrhini | Simiiformes | Hominoidea | Hominidae | Homininae | Homo | Homo sapiens","commonNames":"إنسان @ar | Insan @az | човешки @bg | মানবীয় @bn | Ljudsko biće @bs | Humà @ca | Muž @cs | Menneske @da | Mensch @de | ανθρώπινο ον @el | Humans @en | Humano @es | Gizakiaren @eu | Ihminen @fi | Homme @fr | Mutum @ha | אנושי @he | մարդու @hy | Umano @it | ადამიანის @ka | Homo @la | žmogaus @lt | Om @mo | Mens @nl | Òme @oc | Om @ro | Человек разумный современный @ru | Qenie Njerëzore @sq | மனிதன் @ta | మానవుడు @te | Aadmi @ur | umuntu @zu | ","name":"Homo sapiens","externalId":"EOL:327955"}

danielabar commented 9 years ago

That's neat. I'm currently using findTaxon to get the English common name for the typeahead and didn't realize it also contains path information. That could come in handy.
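As a rough sketch of pulling both pieces out of a findTaxon result: the helper names parseTaxonPath and commonNameFor are invented here, and the code assumes the " | " separators and "name @lang" entries seen in the Homo sapiens example. The sample strings are shortened versions of that example.

```javascript
// Sketch: split the "path" string of a findTaxon result into an ordered list.
// parseTaxonPath and commonNameFor are hypothetical helpers for illustration.
function parseTaxonPath(path) {
  return path.split('|')
    .map(function (part) { return part.trim(); })
    .filter(function (part) { return part.length > 0; });
}

// Sketch: find the common name tagged with the given language code, e.g. "en".
// Returns null when no entry carries that tag.
function commonNameFor(commonNames, lang) {
  var suffix = ' @' + lang;
  var entries = commonNames.split('|');
  for (var i = 0; i < entries.length; i++) {
    var entry = entries[i].trim();
    if (entry.length > suffix.length && entry.slice(-suffix.length) === suffix) {
      return entry.slice(0, -suffix.length);
    }
  }
  return null;
}

// Shortened sample values based on the findTaxon/Homo%20sapiens response.
var taxon = {
  path: 'Animalia | Chordata | Mammalia | Primates | Hominidae | Homo | Homo sapiens',
  commonNames: 'Mensch @de | Humans @en | Humano @es | '
};

console.log(parseTaxonPath(taxon.path));
console.log(commonNameFor(taxon.commonNames, 'en'));
```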

As for appending multiple field parameters to queries, yes, it would be helpful to have that available on the /taxon/:taxon/:interaction URL syntax as well. However (going off on a tangent from this issue), would it be possible to have a single "fields" parameter that takes, say, a comma-separated list of field names?

For example: http://apiurl/taxon/:taxon/:interaction?fields=source_taxon_path,target_taxon_path,study_url,latitude,longitude etc.

The reason I mention it is that I'm using Angular $resource, which serves as a convenient abstraction around $http for interacting with RESTful resources.

The way it's used is that you build a JavaScript object, then call 'get' on the query/resource. It uses the keys in the JavaScript object to dynamically replace the ":" variables in the URL syntax. Any additional keys not matched in the URL syntax are then appended as query parameters.

However, it is not possible to construct a JavaScript object with multiple keys of the same name.
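One possible way around that, as a sketch only: a single key can hold an array, and the serializer can expand it into repeated "field" parameters. The toQueryString helper below is hypothetical plain JavaScript, not Angular code. As far as I know, Angular's default parameter serialization treats array values the same way, but that is worth checking against the Angular version in use.

```javascript
// Sketch: serialize a params object, expanding array values into repeated keys
// (field=a&field=b). toQueryString is a hypothetical helper for illustration.
function toQueryString(params) {
  var pairs = [];
  Object.keys(params).forEach(function (key) {
    var values = Array.isArray(params[key]) ? params[key] : [params[key]];
    values.forEach(function (value) {
      pairs.push(encodeURIComponent(key) + '=' + encodeURIComponent(value));
    });
  });
  return pairs.join('&');
}

console.log(toQueryString({
  field: ['source_taxon_path', 'target_taxon_path']
}));
```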

danielabar commented 9 years ago

Fixed. What do humans eat: http://danielabar.github.io/globi-proto/#/main?name=Homo%20sapiens&interaction=eats

Clicking on Salmon and "eatenBy" now displays many results (not just the eagle): http://danielabar.github.io/globi-proto/#/main?name=Salmonidae&interaction=eatenBy

One thing I notice though, and this may be a separate issue: I had to add a sanity check to limit the number of results, and at 20 the limit is possibly too small. So, by chance, humans may not appear in the list of things that eat salmon even though they did come back in the API response.

The reason for the sanity check is that each result requires a separate API call to get the image and common name data. So a list of, say, 100 results would trigger 100 API calls, which can crash the browser when the list is big enough.

jhpoelen commented 9 years ago

Result looks great! I think the 20-record limit does prevent humans from showing up among the salmon consumers.

Perhaps you might want to introduce pagination. This should be supported in the GloBI API using the limit and offset query parameters:

http://api.globalbioticinteractions.org/taxon/Salmonidae/eatenBy?limit=15&offset=2

See https://github.com/jhpoelen/eol-globi-data/wiki/API#pagination .
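A small sketch of how page requests could be derived on the client. The helper names pageQuery and pageUrl are made up, and the sketch assumes that offset counts the records to skip, which should be checked against the pagination docs linked above.

```javascript
// Sketch: compute limit/offset for a zero-based page index.
// Assumption: offset is the number of records to skip.
// pageQuery and pageUrl are hypothetical helpers for illustration.
function pageQuery(pageIndex, pageSize) {
  return { limit: pageSize, offset: pageIndex * pageSize };
}

function pageUrl(baseUrl, pageIndex, pageSize) {
  var query = pageQuery(pageIndex, pageSize);
  return baseUrl + '?limit=' + query.limit + '&offset=' + query.offset;
}

var base = 'http://api.globalbioticinteractions.org/taxon/Salmonidae/eatenBy';
console.log(pageUrl(base, 0, 20)); // first page of 20
console.log(pageUrl(base, 2, 20)); // third page of 20
```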

Another approach could be to filter the results by specifying a target taxon along with the source taxon.

Hope this helps.

danielabar commented 9 years ago

Paging could be a future enhancement for the grid.