Closed cboettig closed 8 years ago
Yuck, this is why its best to always use all lowercase in DB design :smile:
I'll look into it. :+1:
I was just converted to full-time at Vulcan :tada: so I have some new-hire stuff to go through this second, but if you want to take a whack at it yourself I'd check out the Base#endpoint
method here. The result hash keys need to be examined and the Speccode
key changed to SpecCode
.
interesting that we don't get an error when requesting a non-existent field to query (e.g. asking for SpecCode when the field is called Speccode
Weird, in mysql console locally, I can use Speccode
or SpecCode
and I get the same result - Maybe it's something in activerecord? just guessing
It may also be a setting that is specific to your MySQL instance. I know our DevOps guy had to change a bunch of settings to get the FB site to run correctly due to a number of casing issues.
I inspected the schema and FB indeed does use Speccode
in some places and SpecCode
in others. MySQL may allow for case insensitivity when querying, but processes that ingest that data may not be case insensitive. #lowercaseisthebestcase :wink:
All for standardizing on lowercase if possible. And all the better if this can get standardized / sanitized further upstream in the database.
and @EvilScott , :clap: :fireworks: for converting to full time!
missed that, congrats @EvilScott on full time!
Thanks guys :smile:
I'd love to push up for FB to use lowercase in the DB, but looking at the web code they are pretty heavily using camel case in the PHP code when hitting MySQL. So I don't know if we can count on a good standardization happening anytime soon :grimacing:
@sckott Do you have a list handy of tables that use Speccode
(or perhaps other forms of capitalization) instead of SpecCode
? May as well fix that on the rfishbase
client meanwhile.
just a sec
http://fishbase.ropensci.org/listfields?fields=Speccode&exact=true http://fishbase.ropensci.org/listfields?fields=SpecCode&exact=true
Note that not all of these tables has a corresponding model in ActiveRecord.
here's a jq
version using the API
http 'http://fishbase.ropensci.org/listfields' | jq '.data[] | select(.column_name | match("speccode"; "i"))' | jq -s
[
{
"table_name": "abnorm",
"column_name": "SpecCode"
},
{
"table_name": "abundance",
"column_name": "SpecCode"
},
...
ah, of course, listfields
is handy for this, thanks! We'll get these changes into rfishbase
at least.
@sckott @EvilScott Do you think it's worth trying to address this on the API end or shall we just close this? Not really a bug since the keys are different in the database, but certainly a gotcha for anyone trying to navigate the API.
I guess we could lowercase all keys. Do you think that's a bad idea @EvilScott
@cboettig thats true. someone may actually expect things to be the "proper" cases and if we change it we may be screwing up someone else's code :grimacing:
@sckott lowercasing all keys on the API or the R library?
@EvilScott lowercase the keys in the JSON returned from the API
@sckott i think its correct, but its a judgement call that you guys need to make.
@cboettig thoughts on lower-casing keys in all data output by the API? Don't know yet what performance hit we'll take, but hopefully minimal
Probably better to just stick with what the database gives us. Changing the output won't avoid the original problem here (we'd have to do something about mapping input queries), and might be inconsistent if user somehow expect the capitalization used in the underlying data..
I do wonder if we should do something to the https://github.com/ropensci/fishbaseapi/blob/master/models/base.rb#L7 method though so that either SpecCode or Speccode do the same thing. It's the main key for most of the database, and it's kind of silly to expect users / developers to keep track of whether it is Speccode or SpecCode for each and every table. Not sure how best to handle that though. If the SQL query is case-insensitive, maybe we can just re-map both to lowercase there?
@EvilScott been playing with this. Seems that with activerecord, we're fine with queries passed to fields
as case doesn't matter there, but on the where
statement, it does matter. Can we implement a LIKE
statement with activerecord::where
? From my googling, that seems like it would be case insensitive.
@sckott I'm confused. I thought the issue was the inconsistencies in casing returned from the DB, not querying the DB. Where do you need to implement a LIKE statement?
I think we have problems in querying, not just data returned
In the where
statement, it is case sensitive, so the call
http://fishbase.ropensci.org/maturity?SpecCode=4
doesn't actually work as the where statement returns data, but as you can see the results don't match 4
Actually it does appear that select
statement to return certain fields is case sensitive too, e.g.,
http://fishbase.ropensci.org/maturity?fields=SpecCode
doesn't return just Speccode
- whereas the right case works:
Hrm, interesting. That adds another wrinkle
OK so first part is the issue with the WHERE clause. We are using params that are included in the column list. Here is the code. The kicker is Array#include?
is case sensitive:
1.9.3-p551 :002 > %w(foo BAR baz).include? "bar"
=> false
So we may still be able to use case insensitive in our querying, but with that select
block it may be excluding valid params.
For the SELECT clause I'd guess a similar issue is here in the place where we are making a list of fields. Off chance: it may also have to do with wrapping the field name in backticks.
Thanks, that's helpful.
it may also have to do with wrapping the field name in backticks.
Right, we did that b/c of https://github.com/ropensci/fishbaseapi/issues/79#issuecomment-157890188
I guess we wouldn't need backticks if there weren't reserved words in the tables, but we can't change that
I just tested backticks with my local MySQL DB and it doesn't seem to make a difference.
select `classnum` from classes limit 10;
returns the same rows as:
select `ClassNum` from classes limit 10;
The only difference is the column name of the result, classnum
vs ClassNum
. Hopefully this means if we make the #include?
check case insensitive (easiest to accomplish by downcasing the needle and the haystack both) then this should solve both problems.
is this most efficient x.collect { |x| x.downcase }
Ruby protip incoming:
%w(FOO BAR BAZ).map(&:downcase) # => ["foo", "bar", "baz]
http://stackoverflow.com/questions/1217088/what-does-mapname-mean-in-ruby
nice
@EvilScott I got something to work so that case is ignored
where(params.select { |param| fields.any? { |s| s.casecmp(param)==0 } })
However, other params passed in the query e.g., limit
make the query fail with no implicit conversion of Symbol into String
- Is there something obvious I can do to fix
If you just #to_s
on the things you are comparing you should be fine:
s.to_s.casecmp(param.to_s) == 0
ah, thanks, will try
excellent :grinning:
:bullettrain_side:
Looks like some tables, such as
maturity
table, have the Species Code asSpeccode
instead ofSpecCode
.rfishbase
uses a generic function to call endpoints by Species Code, since most tables are keyed by Species Code and this avoids a lot of repetitive code. I could try and handle these cases explicitly, but a general server-side solution might be better. @EvilScott any ideas on how best to handle this?also, interesting that we don't get an error when requesting a non-existent field to query (e.g. asking for
SpecCode
when the field is calledSpeccode
):http://fishbase.ropensci.org/maturity?SpecCode=4
http://fishbase.ropensci.org/maturity?Speccode=4