Open hrbrmstr opened 7 years ago
Sounds great! I could contribute some domain knowledge on this (albeit a little light on the fisheries-related issues). I wonder if a major outcome of this effort could be a detailed description of the development process so that people could write packages for the many similar database interfaces for other areas (I maintain a list with some at: https://jsta.github.io/limnology_models_data/). I am thinking less of "use this package" and more of "here's how we found the API endpoint + parameters" and "here's how you know that Selenium is required".
IMO that would be a superb resource for folks (that's a nice list of other databases, too).
While I won't be there, I was planning on blocking off the 25th and 26th so that I can follow along remotely! I'd be very interested in what you come up with here and happy to contribute.
And nice list, @jsta! One thing that is becoming apparent (at least to me) is that a harmonized lakes database (at least for the US, but also Canada) would be great. There are many folks working in similar directions (EPA, USGS, you and Patricia and others...). A lot of really cool things could happen if a national (North American) lakes database came to pass. But I digress...
@jhollist We'd love to have you there remotely. I'm guessing you're already on our Slack, and hopefully the team that you join can also have you connected by voice/video for at least part of it. You might ping Nick Tierney/Miles McBain to see how they pulled it off last year as part of Bob's team.
Thanks @karthik! I will ping them and work with @hrbrmstr, @jsta, or others (I'm interested in a lot of the issues, e.g. #5) on the best way to get looped in. One of these years I'll throw my hat in the ring to hopefully attend in person!
> One of these years I'll throw my hat in the ring to hopefully attend in person!
You should and we'd be delighted to have you in person!
@jhollist Let me know if there's anything I can do to help you work remotely. As rOpenSci's community manager my unconf role will be 100% facilitation.
Nick Tierney said his main barrier was just the Australian time zone. The group had meetings as needed via https://appear.in.
@stefaniebutland Thanks! My plan at this point is to follow along via Slack (although I need to track down my 2FA codes, b/c my authenticator isn't working ...) and GitHub. Thanks for the link to appear.in. That will be useful. If I have any other issues, I will let you know.
I will also not be there, but I am most curious about this particular issue. Not because of the JSON, but because the dataset is interesting and I'm trying to learn new things. Whether or not this one goes forward, I'd like to try and participate in it as well. @stefaniebutland Is there a runconf17 Slack channel I need to join? I'm in General and Random thanks to @sckott
@hrbrmstr and @jsta If this gets any traction on Thursday, do hit me up on Slack or Twitter. I'll be following along 11-4:30 EDT and can hop on appear.in if a chat makes sense. Like @bhive01, I am interested in helping, especially with anything lake related!
Will do. I'm super interested to see how Thu will go :-)
I took a look at the structure of the query results. You were not kidding about the nestedness. It makes sense to me to return results for a single lake as a single list of data frames. For example, a query like lakefinder_get(lake = "56011602") would return a list object with the following structure:
|__characteristics
   |__name
   |__id
   |__max_depth
   |__...
|__surveys
   |__id
   |__date
   |__quartile
   |__cpue
   |__species
   |__length
   |__...
It is not clear to me without further digging which columns in the survey object represent derived quantities versus unique data. For example, it seems that maximum_length and minimum_length are derived from fishCount. Is quartileCount also derived from fishCount? It seems like quartileWeight is unique (not derived) as there is no fishWeight column.
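To make the proposed return shape concrete, here is a minimal sketch of the reshaping step, written in Python purely for illustration (an actual package would presumably do the equivalent in R). Everything in it is an assumption: the sample record is hand-written rather than real LakeFinder output, the `catches` key and the other field names are invented, and `flatten_lake` is a hypothetical helper. Plain lists of row dicts stand in for data frames.

```python
import json

# Hypothetical, hand-written sample record. This is NOT actual LakeFinder
# output; the real JSON is more deeply nested and uses different field names.
SAMPLE = json.loads("""
{
  "id": "00000000",
  "name": "Example Lake",
  "max_depth": 12.5,
  "surveys": [
    {"id": "s1", "date": "2001-07-15",
     "catches": [
       {"species": "species_a", "cpue": 3.2},
       {"species": "species_b", "cpue": 1.1}
     ]},
    {"id": "s2", "date": "2006-07-20",
     "catches": [
       {"species": "species_a", "cpue": 2.7}
     ]}
  ]
}
""")


def flatten_lake(raw):
    """Split one nested lake record into two flat tables (lists of row dicts)."""
    # Scalar top-level fields become a single-row "characteristics" table.
    characteristics = [
        {key: value for key, value in raw.items()
         if not isinstance(value, (list, dict))}
    ]
    # One row per survey/catch pair, with the survey id and date repeated
    # on every row so the table can be filtered or joined on them later.
    surveys = []
    for survey in raw.get("surveys", []):
        for catch in survey.get("catches", []):
            row = {"survey_id": survey["id"], "date": survey["date"]}
            row.update(catch)
            surveys.append(row)
    return {"characteristics": characteristics, "surveys": surveys}


result = flatten_lake(SAMPLE)
```

Repeating the survey id and date on each catch row keeps the surveys table flat at the cost of some duplication. Whether derived columns should be kept or recomputed is a separate question that this sketch does not address.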
http://www.dnr.state.mn.us/lakefind/index.html
There have been a few SO questions (btw: I don't think that search result is comprehensive but it's indicative) that need to get to the underlying, heavily nested JSON result.
Might be worth a pkg attempt. I'm not smart enough in the underlying data to know what to do on my own with it (I'd be making too many assumptions and not making the right connections/labels).