Closed GoogleCodeExporter closed 9 years ago
Thanks Ben. Assigning to Natasha.
There seem to be problems in the BIE with the genus Grevillea.
Original comment by moyesyside
on 24 Feb 2014 at 7:10
Some notes:
Grevillea seems to be loaded ok
http://bie.ala.org.au/species/urn:lsid:biodiversity.org.au:apni.taxon:378603
but searches result in some errors
http://bie.ala.org.au/search?q=Grevillea
Original comment by moyesyside
on 24 Feb 2014 at 7:14
This has been fixed with the latest release. There was an issue where the bulk
lookup was not preferring the "exact" name over other matches.
We have not inserted empty records for non-matched names yet. I am not sure of
the regression effects on other apps that are using the service.
Original comment by natasha....@csiro.au
on 28 Feb 2014 at 5:55
I suggest we make it an additional boolean request parameter to include empty
records for non-matched names. We can default this to false (don't include
empty records) to avoid regression problems.
Original comment by moyesyside
on 28 Feb 2014 at 8:06
I think that we will need to version the API. At the moment this webservice
expects the entire request body to be an array of names. We cannot add extra
params to it without changing the format to a JSON map of params.
What I propose is running two web services side by side, with the version
included in the URI, so that both calls will work.
Original comment by natasha....@csiro.au
on 2 Mar 2014 at 10:45
I have released a test bie-service instance with a new service that inserts a
null value when the lookup does not return a value. Thus the order in which
the results appear corresponds to the order of the supplied list. The format
of the JSON request body is slightly different.
Test webservice (please be aware that there may be some performance issues on
the test server)
http://118.138.243.151/bie-service/ws/species/lookup/bulk
Example JSON body:
{"names":"[\"Grevillea humilis\",\"Macropus rufus\", \"ZZZ nnn\"]"}
Here is the example:
http://apikitchen.com/#dE41G
Original comment by natasha....@csiro.au
on 3 Mar 2014 at 3:20
Thanks for getting stuck into this. A few comments:
1. Test instance works for me, although the names component of the JSON body
now needs to contain a string representing a JSON array, rather than an actual
JSON array. Wouldn't it make more sense to have it as
{"names":["name1","name2"]}?
2. If a versioned API is going to create more headaches than it solves, is it
sensible to think about allowing a URL parameter to the original web service to
control the null-record behaviour? e.g.
http://bie.ala.org.au/ws/species/bulklookup.json?includenull=true (and then
with the POSTed JSON body the same as before)
3. Still not convinced that the name-matching logic is working as it should.
e.g. for "Macropus rufus" I get the right name back, but for "Macropus rufu" I
get Marsilea macropus, with a score of 0.04. Surely "Macropus rufus" is a
better match to "Macropus rufu"?
Thanks
B
Original comment by antarcti...@gmail.com
on 3 Mar 2014 at 6:10
Thanks Ben.
1. Agree
2. My understanding is that this would work for some http servers but is not a
recommended approach in the HTTP spec.
3. This is beginning to stretch the API in new ways. These calls weren't really
intended to support fuzzy matches or matches for partial names. Is this really
useful in a bulk context?
Original comment by moyesyside
on 3 Mar 2014 at 9:38
Re 3: my misunderstanding, perhaps. I confess that originally I wasn't
expecting the bulklookup to give matches to partial names. It isn't necessary
(for the R stuff) for it to do so - just exact-matches is fine. But currently
this service *is* doing fuzzy/incomplete matching. If it's going to do it, then
surely it should do so sensibly! My concern is that a user might submit a bunch
of names including (say) "Macropus rufu", and not catch the fact that it's been
matched to something totally different. (Yes they should check, but I'd argue
that this particular example isn't a reasonably-expected result for a
name-matching service). Does it make more sense to only return exact matches?
(Or again, will that potentially break backwards compatibility with existing
users?)
Original comment by antarcti...@gmail.com
on 3 Mar 2014 at 10:07
Thanks Ben. I agree. We should just support exact matches (with synonym
resolution) and return null where an exact match wasn't found. Backwards
compatibility shouldn't be an issue as this is a brand new URL path not in use.
Original comment by moyesyside
on 3 Mar 2014 at 10:17
Yep, that'll work for us. Ta.
Original comment by antarcti...@gmail.com
on 3 Mar 2014 at 10:19
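The behaviour agreed above (exact matches only, synonym resolution, null for anything else, results in input order) could be sketched roughly as follows. This is only an illustration against in-memory dictionaries, not the bie-service implementation; the function name, the identifiers and the synonym "Synonymus exemplum" are made up for the example.

```python
from typing import Dict, List, Optional

# Hypothetical lookup tables; the real service resolves names against its own index.
ACCEPTED: Dict[str, str] = {
    "Macropus rufus": "id-1",
    "Grevillea humilis": "id-2",
}
# Maps a synonym to its accepted name ("Synonymus exemplum" is a made-up placeholder).
SYNONYMS: Dict[str, str] = {"Synonymus exemplum": "Macropus rufus"}


def bulk_lookup(names: List[str],
                accepted: Dict[str, str],
                synonyms: Dict[str, str]) -> List[Optional[str]]:
    """Return one entry per input name, in input order; None where nothing matched exactly."""
    results: List[Optional[str]] = []
    for name in names:
        # Resolve a synonym to its accepted name, otherwise keep the name as given.
        resolved = synonyms.get(name, name)
        # Exact match only: a partial name such as "Macropus rufu" yields None.
        results.append(accepted.get(resolved))
    return results


matched = bulk_lookup(["Macropus rufus", "Macropus rufu", "Synonymus exemplum"],
                      ACCEPTED, SYNONYMS)
```

Because every input name produces exactly one entry, the caller can line results up with the submitted list without any extra bookkeeping.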
OK, changes made and deployed to the test server.
The webservice now accepts JSON params correctly:
{"names":["Grevillea humilis","Macropus rufus", "ZZZ nnn","Macropus rufus (Desmarest, 1822)"]}
The matches should also be better.
Original comment by natasha....@csiro.au
on 5 Mar 2014 at 5:49
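For anyone calling the new service from a script, a rough Python sketch of the request is below. It uses the test-server URL and JSON body quoted in this thread, and it assumes (per the earlier comment) that the response is a JSON array with null for unmatched names, in the same order as the input. The helper names are invented for the example, and the fields of each returned record are treated as opaque since they are not described here.

```python
import json
import urllib.request
from typing import Any, List, Optional, Tuple

# Test-server endpoint quoted earlier in this thread.
BULK_URL = "http://118.138.243.151/bie-service/ws/species/lookup/bulk"


def build_bulk_request(names: List[str], url: str = BULK_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request whose body is {"names": [...]}."""
    body = json.dumps({"names": names}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
    )


def pair_with_input(names: List[str],
                    results: List[Optional[Any]]) -> List[Tuple[str, Optional[Any]]]:
    """Pair each submitted name with its result; relies on order being preserved."""
    if len(names) != len(results):
        raise ValueError("expected one result (or null) per submitted name")
    return list(zip(names, results))


names = ["Grevillea humilis", "Macropus rufus", "ZZZ nnn"]
request = build_bulk_request(names)
# To actually send it: urllib.request.urlopen(request), then json.load() the response.
```

The length check in `pair_with_input` is there because the pairing is only meaningful if the service really does return a null placeholder for every unmatched name.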
All looks good to me! Thanks.
One remaining issue that may or may not make sense to address is names that can
match multiple LSIDs. e.g. "Oenanthe" can match either birds or plants. I don't
know if it makes sense to return all matches in this case? (e.g. the returned
data structure would be an array of arrays). An alternative would be to add an
"is_unique" column so that potential exact-but-incorrect matches can be flagged
for the user to sort out via some other service.
Original comment by antarcti...@gmail.com
on 7 Mar 2014 at 4:01
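To make the "is_unique" alternative concrete, here is a small sketch of what a flagged result could look like. It is purely illustrative: the index, the identifiers and the field names other than "is_unique" are hypothetical, and nothing here reflects what the service actually returns.

```python
from typing import Dict, List, Optional

# Hypothetical index: one name can map to several identifiers (e.g. a bird and a plant genus).
INDEX: Dict[str, List[str]] = {
    "Oenanthe": ["id-bird", "id-plant"],
    "Macropus rufus": ["id-1"],
}


def lookup_with_flag(names: List[str],
                     index: Dict[str, List[str]]) -> List[Optional[dict]]:
    """Return one record per name, in input order; None where nothing matched."""
    out: List[Optional[dict]] = []
    for name in names:
        candidates = index.get(name, [])
        if not candidates:
            out.append(None)
            continue
        out.append({
            "name": name,
            "candidates": candidates,            # every exact match found
            "is_unique": len(candidates) == 1,   # False flags an ambiguous name
        })
    return out


flagged = lookup_with_flag(["Oenanthe", "Macropus rufus", "ZZZ nnn"], INDEX)
```

This keeps the result a flat list with one entry per input name, so a caller can filter on `is_unique` and resolve the ambiguous names through some other service.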
Ah, me again. Another problem, I think.
In searching for "Grevillea" I'm getting "Grevillea banksii" as the returned
name. I'd expect this query to give me "Grevillea" the genus. I'm pretty sure
that this is happening because the matching is happening on common names as
well as scientific names. Searching on "red kangaroo" gets me "Macropus rufus".
The API says that this service takes a list of scientific names; it doesn't
mention common names. Should it then be matching on common names? I think this
is the cause of the problem.
Original comment by antarcti...@gmail.com
on 8 Mar 2014 at 4:44
OK, so we have changed the behaviour of the new bulklookup. By default we are
not allowing common name matches. Common name matches can be included (if
necessary) via the following JSON body:
{"names":["Grevillea"],"vernacular":true}
This will not limit it to only common name matches; rather, it replicates the
original behaviour for regression use.
This change has been deployed to the test server.
Original comment by natasha....@csiro.au
on 2 Apr 2014 at 12:17
released
Original comment by natasha....@csiro.au
on 2 Apr 2014 at 2:13
Not sure that the vernacular option is working. When I try it, I get a
400 error with:
Format of input incorrect: Unexpected character ('f' (code 102)): was
expecting double-quote to start field name
Same code without the vernacular part works fine.
(Same behaviour on both test and live server)
Original comment by antarcti...@gmail.com
on 2 Apr 2014 at 7:01
I'm guessing you aren't setting a content type of application/json in the request.
Here's an example:
http://apikitchen.com/#tkcW9
Original comment by moyesyside
on 2 Apr 2014 at 7:50
Turned out it was because my conversion to JSON was writing vernacular
as an array (i.e. "vernacular":[true] not "vernacular":true). Fixed at
my end now.
Original comment by antarcti...@gmail.com
on 2 Apr 2014 at 8:33
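For reference, the difference between the two bodies is easy to reproduce. The Python sketch below (the helper name is made up) builds the body with "vernacular" as a plain boolean and rejects anything else, which would catch the array-wrapped form before it reaches the server.

```python
import json
from typing import List


def build_body(names: List[str], vernacular: bool = False) -> str:
    """Serialise the request body with "vernacular" as a JSON boolean, not an array."""
    if not isinstance(vernacular, bool):
        # A list such as [True] would serialise to [true] instead of true.
        raise TypeError("vernacular must be a plain bool")
    return json.dumps({"names": list(names), "vernacular": vernacular})


good = build_body(["Grevillea"], vernacular=True)
# What an array-wrapped value serialises to, for comparison:
wrapped = json.dumps({"names": ["Grevillea"], "vernacular": [True]})
```

Here `good` ends in `"vernacular": true` while `wrapped` ends in `"vernacular": [true]`, which is the form that turned out to be the problem.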
Original issue reported on code.google.com by
antarcti...@gmail.com
on 22 Feb 2014 at 10:59