codeforamerica / trailsyserver

API and admin UI server for Trailsy data

Structure API for extracting alternate trailsegment structure #2

Closed smathermather closed 10 years ago

smathermather commented 11 years ago

Regarding adding a component for data structured for the brigade app, thinking along these lines:

http://trailsyserver-prod.herokuapp.com/trailsegments.json?format=alt

or similar. This would point to an alternate json endpoint something like this:

json.extract! @trailsegment, :length, :source, :steward, :geom, :name1, :name2, :name3, :horses, :dogs, :bikes, :created_at, :updated_at

That would be predicated on a database view that joins :horses, :dogs, :bikes etc. back to trailsegments where it's not already populated at the segment level.

danavery commented 11 years ago

In that last segment, do you mean "back to trails" instead of trailsegments?

So the idea is to return the trail-level use information for segment-level requests if there's no segment-level data for each attribute, correct? If that's going to be useful for the brigade app, we (including you in that "we" :) can certainly get that done. In theory, it could be the same endpoint and just return whichever use fields are available (at whichever level they're available) for every segment request. I suppose it would depend on whether it would be important to know whether you were getting segment-level or trail-level info. If it's not, the same endpoint might work just fine, and it's just a matter of adding the per-field logic.
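
For illustration, the per-field logic could look roughly like the sketch below. It is plain Ruby rather than the trailsyserver code: `resolve_use_fields`, the `USE_FIELDS` list, and the hash-based records are all hypothetical, and the field names are the pre-rename ones from the first comment. It also records which level each value came from, in case that turns out to matter.

```ruby
# Hypothetical use fields (the real column names were still being renamed).
USE_FIELDS = %i[horses dogs bikes].freeze

# Resolve each use field for a segment, falling back to the trail.
# `segment` and `trail` are plain hashes standing in for database records.
# Returns { field => { value: ..., level: :segment | :trail | nil } }.
def resolve_use_fields(segment, trail)
  USE_FIELDS.each_with_object({}) do |field, out|
    if !segment[field].nil?
      # Segment-level data wins whenever it exists.
      out[field] = { value: segment[field], level: :segment }
    elsif trail && !trail[field].nil?
      # Otherwise fall back to the trail-level value.
      out[field] = { value: trail[field], level: :trail }
    else
      # Nothing known at either level.
      out[field] = { value: nil, level: nil }
    end
  end
end

segment = { horses: "N", dogs: nil, bikes: nil }
trail   = { horses: "Y", dogs: "Y", bikes: nil }
resolved = resolve_use_fields(segment, trail)
puts resolved.inspect
```

A client that does not care where a value came from can ignore the `level` key, so the same endpoint could serve both cases.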

A few quick notes:

  1. I've been using RGeo::GeoJSON instead of Jbuilder to produce the JSON responses, because it seemed quicker on the first attempt to produce features and feature collections and the like. You can see the implementation in the trailhead and segment controllers.
  2. In the interests of keeping the input and editing processes simple, the table structure is not as clean as DB purists like myself might like. To find a trail based on a segment, one would have to match up both the particular trail name (trail[x]) and the source field with the name and source fields in the trail metadata table. The DB structure makes some of the Rails syntactic sugar, specifically around joins, sadly unavailable for now. Hopefully the lookup won't be too much of a performance hit, at least before spending any energy around caching just yet.
  3. I'm syncing up the DB field names with the standard now, so "horses" and "bikes" will of course be changing soon.
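
The name-and-source lookup described in note 2 could be sketched like this, without any Rails association. This is an assumption-laden illustration, not the actual models: the `Trail` and `Segment` structs, the `trails_for` helper, and the `name1`..`name3` columns (borrowed from the `json.extract!` line in the first comment) are stand-ins, and the sample values are made up.

```ruby
# Hypothetical stand-ins for the trail metadata and segment tables.
Trail   = Struct.new(:name, :source, keyword_init: true)
Segment = Struct.new(:source, :name1, :name2, :name3, keyword_init: true)

# Find every trail a segment belongs to by matching BOTH the source
# and one of the segment's trail-name columns against the trail table.
def trails_for(segment, trails)
  names = [segment.name1, segment.name2, segment.name3].compact
  trails.select do |trail|
    trail.source == segment.source && names.include?(trail.name)
  end
end

# Illustrative data only.
trails = [
  Trail.new(name: "Towpath",       source: "agency_a"),
  Trail.new(name: "Buckeye Trail", source: "agency_a"),
  Trail.new(name: "Towpath",       source: "agency_b")
]
segment = Segment.new(source: "agency_a", name1: "Towpath",
                      name2: "Buckeye Trail", name3: nil)

puts trails_for(segment, trails).map(&:name).inspect
```

An in-memory scan like this is linear in the number of trails per segment, which is the kind of cost the caching remark above is about.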

smathermather commented 11 years ago

> In that last segment, do you mean "back to trails" instead of trailsegments?

What I mean is that we have an API endpoint that returns the highest granularity of data available, and fills in where needed. So if use types are available at trailsegment level, return those, otherwise crosswalk with trail to get use types at the segment level.

  1. Cool, I'll look at that. So this will replace json.extract for trails and trailsegments?
  2. I'm not as much of a DB purist as I'd like to think, so no complaints here, so long as ends can be met.
  3. Cool.

smathermather commented 11 years ago

  1. Ah, got it, e.g. https://github.com/danavery/trailsyserver/blob/d2226617feeeeb2edf9bf1bf40861be95c50ecb4/app/models/trailsegment.rb

danavery commented 11 years ago

Cool. That's exactly what I thought you meant. Return the trail use fields in place of the segment use fields if they're not available in the segment.

The catch is that segments can be part of more than one trail, and those trails could have trail-level overall use values that differ. I can't imagine that happening too much, but it could happen. So when looking up a particular segment that doesn't have use data, it's not clear which trail's fields to use. The most-restrictive, least-restrictive, or even just the first trail listed are all options. My first instinct is to return "N" if any of the associated trails has an "N" in that field, but we could definitely consider other methods.
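
To make the options concrete, here is a rough sketch of the three merge strategies for a single use field. It assumes values are stored as "Y"/"N" strings with `nil` for unknown, which is a guess at the encoding, and the function names are hypothetical.

```ruby
# Each function takes the values of ONE use field (e.g. bikes) from
# every trail a segment belongs to. Assumed encoding: "Y", "N", or nil.

# "N" if any associated trail says "N"; nil if nothing is known.
def most_restrictive(values)
  known = values.compact
  return nil if known.empty?
  known.include?("N") ? "N" : "Y"
end

# "Y" if any associated trail says "Y"; nil if nothing is known.
def least_restrictive(values)
  known = values.compact
  return nil if known.empty?
  known.include?("Y") ? "Y" : "N"
end

# Whatever the first listed trail with a known value says.
def first_listed(values)
  values.compact.first
end

bikes = ["N", "Y", nil] # three trails sharing one segment
puts most_restrictive(bikes).inspect
puts least_restrictive(bikes).inspect
puts first_listed(bikes).inspect
```

Whichever strategy is picked, it only applies when the segment itself has no value for the field.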

smathermather commented 11 years ago

Mmm, good point on the catch. One-to-many... Both errors of inclusion and errors of exclusion are problematic. Any tangible examples?

smathermather commented 11 years ago

I suspect this is a lossy data problem in the end: there's no way to reconstruct the segment-level data in any meaningful way.

Tangible example. The bridle trail between Bedford and Brecksville Reservations shares portions of the Towpath, and diverges from the Towpath where possible, weaving in and out. e.g.: [image: cvnp_bridle_trail and mix of trails]

(more to be seen here: http://maps.clevelandmetroparks.com/url/cNc)

This is also Buckeye Trail. If we go with the most restrictive, which is Buckeye Trail, then we exclude bikes and bridle use. If we go with the most permissive... meh. I can't think this one through at this hour.

Thoughts?

danavery commented 11 years ago

Yeah, that's not an ideal case, for sure.

The problem isn't as much with the code (in its current or future forms) as it is with the data we've been working with. Since we've just now mostly locked down the use/access fields at both the trail and segment levels, and we're not using the per-segment use/access fields in Trailsy at the moment, we've only asked for and received per-trail attributes.

We can (and will) store and serve up whatever segment-level data we receive, but last I was aware, it's not really available on the segment level from everyone involved. Also, given that we've been asking for a lot of reformatted data in a fairly short period of time, it may not be practical to require segment-level data in the next couple of weeks, assuming it even exists everywhere. That's not to say it couldn't be made available and imported into the app sometime soon. If high-accuracy segment-level access/use data for every segment is critical for a Brigade app, that might be a great place to start working directly with the Summit County partners once the dust has settled.

But at first glance it would seem that any app should probably have some fallback behavior for agencies that may be providing multiple segments for a trail, but may not necessarily be willing or able to vouch for the accessibility/use of every segment individually. For example, it's still unclear how detailed some of the Summit County townships' data is going to be if and when they start submitting data.

Let me know if that makes any sense or if you think I'm off-base here.

smathermather commented 10 years ago

The problem with pushing this to the app level is that sometimes there is no solution (as we established earlier in this thread). What do we do when we want to use, e.g., the Towpath in OSM, which inherently uses segment-level info, or an application whose routing algorithm pays attention to segment-level access (the brigade app)? There is no appropriate fallback logic.

smathermather commented 10 years ago

IMNNHEO, if segment-level use type turns out to be a real barrier to participation, then rather than adjusting the trail standard, we should be engaging the leadership to offer expertise in helping these communities, e.g. tapping CVNP or similar to help with the initial pass of data. That's why we share resources across communities, and collaborate on projects.

danavery commented 10 years ago

Agreed! If we had more time left in the fellowship, we might very well be working on that sort of coordination. As it is, we can definitely help start those conversations (or plant the seeds of those talks) with the communities that have shown interest in participating before we wrap things up, and the Brigade project can continue them as it progresses.

smathermather commented 10 years ago

Cool. If you initiate that conversation, in addition to whatever the Brigade commits, I can commit some technical help in the short and long term. CM maintains an 8-county open space and trails inventory, so we have a vested institutional interest in this.

smathermather commented 10 years ago

So, I am wondering whether there is value in creating a separate API ahead of data that doesn't exist yet. Not trying to shirk coding duties here, but this feels like something that could be bumped. Thoughts?

danavery commented 10 years ago

As a start, I've added the access/use fields to the segment JSON. For segments where we already have the data for a particular field, it's there in the returned value. If we don't have the data, the value is "null". Later imports of segment data could certainly fill in the blanks.

http://trailsyserver-dev.herokuapp.com/trailsegments/612.json for an example.
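
A simplified picture of that behavior, using only Ruby's standard `json` library: the real responses are built with RGeo::GeoJSON, so this is just a sketch of the null-filling idea, and `segment_json`, `SEGMENT_USE_FIELDS`, and the field names are hypothetical.

```ruby
require "json"

# Hypothetical access/use field names.
SEGMENT_USE_FIELDS = %w[horses dogs bikes].freeze

# Always emit every use field; a field with no data comes out as JSON null.
def segment_json(attrs)
  payload = { "id" => attrs["id"] }
  SEGMENT_USE_FIELDS.each do |field|
    payload[field] = attrs[field] # missing key -> nil -> null in JSON
  end
  JSON.generate(payload)
end

# A segment where only one use field has been imported so far.
puts segment_json("id" => 1, "horses" => "Y")
```

Because every field is always present, a client can distinguish "no data yet" (null) from an explicit "N" without checking whether the key exists.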

smathermather commented 10 years ago

Cool. BTW, the example gives me a 404.

So, I think, at least for Bedford and Brecksville, Anthony will be including segment level data (if he hasn't sent already). He may include it for the rest of CUVA as well.

smathermather commented 10 years ago

Cool, looked at http://trailsyserver-dev.herokuapp.com/trailsegments/1.json and this seems to do the trick. So this will return segment-level data if available?

danavery commented 10 years ago

Exactly. It can be part of the uploaded data, or edited in the admin UI.

smathermather commented 10 years ago

Perfect. Thanks Dan.