Structure API for extracting alternate trailsegment structure

smathermather commented 11 years ago

Regarding adding a component for data structured for the brigade app, thinking along these lines:

http://trailsyserver-prod.herokuapp.com/trailsegments.json?format=alt

or similar. This would point to an alternate json endpoint something like this:

json.extract! @trailsegment, :length, :source, :steward, :geom, :name1, :name2, :name3, :horses, :dogs, :bikes, :created_at, :updated_at

That would be predicated on a database view that joins :horses, :dogs, :bikes etc. back to trailsegments where it's not already populated at the segment level.

danavery commented 11 years ago

In that last segment, do you mean "back to trails" instead of trailsegments?

So the idea is to return the trail-level use information for segment-level requests if there's no segment-level data for each attribute, correct? If that's going to be useful for the brigade app, we (including you in that "we" :) can certainly get that done. In theory, it could be the same endpoint and just return whichever use fields are available (at whichever level they're available) for every segment request. I suppose it would depend on whether it would be important to know whether you were getting segment-level or trail-level info. If it's not, the same endpoint might work just fine, and it's just a matter of adding the per-field logic.

A few quick notes:

1) I've been using RGeo::GeoJSON instead of Jbuilder to produce the JSON responses, because it seemed quicker on the first attempt to produce features and feature collections and the like. You can see the implementation in the trailhead and segment controllers.

2) In the interests of keeping the input and editing processes simple, the table structure is not as clean as DB purists like myself might like. To find a trail based on a segment, one would have to match up both the particular trail name (trail[x]) and the source field with the name and source fields in the trail metadata table. The DB structure makes some of the Rails syntactic sugar, specifically around joins, sadly unavailable for now. Hopefully the lookup won't be too much of a performance hit, at least before spending any energy around caching just yet.

3) I'm syncing up the DB field names with the standard now, so "horses" and "bikes" will of course be changing soon.
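For illustration, the name-plus-source lookup described in note 2 might look roughly like the sketch below. It uses plain Ruby structs rather than the real models; `Trail`, `Segment`, and `trails_for` are hypothetical names, and the segment name fields are assumed to be `name1` through `name3` as in the `json.extract!` line earlier in the thread.

```ruby
# Hypothetical stand-ins for the real tables; field names are assumptions.
Trail   = Struct.new(:name, :source, keyword_init: true)
Segment = Struct.new(:name1, :name2, :name3, :source, keyword_init: true)

# Return every trail the segment belongs to, matching each non-blank
# segment name (name1..name3) together with the source field.
def trails_for(segment, trails)
  names = [segment.name1, segment.name2, segment.name3]
          .compact
          .reject { |n| n.strip.empty? }
  trails.select { |t| t.source == segment.source && names.include?(t.name) }
end
```

Because a segment can carry several names, the result is a list and not a single trail, which is why an ordinary Rails `belongs_to` join does not fit this structure directly.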

smathermather commented 11 years ago

In that last segment, do you mean "back to trails" instead of trailsegments?

What I mean is that we have an API endpoint that returns the highest granularity of data available, and fills in where needed. So if use types are available at trailsegment level, return those, otherwise crosswalk with trail to get use types at the segment level.
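As a rough sketch of that per-field fallback (not the project's actual code), the logic could be written as below. The segment and trail are plain hashes here, and `USE_FIELDS`, `missing?`, and `uses_with_fallback` are hypothetical names. Each result also records which level the value came from, in case a client needs to tell segment-level data from trail-level data.

```ruby
# Use fields mentioned in this thread; the real column names may differ.
USE_FIELDS = %i[horses dogs bikes].freeze

# Treat nil and blank strings as "not populated".
def missing?(value)
  value.nil? || value.to_s.strip.empty?
end

# For each use field, prefer the segment's own value and fall back to
# the trail's value when the segment has none.
def uses_with_fallback(segment, trail)
  USE_FIELDS.map do |field|
    result =
      if !missing?(segment[field])
        { value: segment[field], level: :segment }
      elsif !missing?(trail[field])
        { value: trail[field], level: :trail }
      else
        { value: nil, level: nil }
      end
    [field, result]
  end.to_h
end
```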

1) Cool, I'll look at that. So this will replace json.extract for trails and trailsegments?
2) I'm not as much of a DB purist as I'd like to think, so no complaints here, so long as ends can be met.
3) Cool.

smathermather commented 11 years ago

Ah, got it, e.g. https://github.com/danavery/trailsyserver/blob/d2226617feeeeb2edf9bf1bf40861be95c50ecb4/app/models/trailsegment.rb

danavery commented 11 years ago

Cool. That's exactly what I thought you meant. Return the trail use fields in place of the segment use fields if they're not available in the segment.

The catch is that segments can be part of more than one trail, and those trails could have trail-level overall use values that differ. I can't imagine that happening too much, but it could happen. So when looking up a particular segment that doesn't have use data, it's not clear which trail's fields to use. The most-restrictive, least-restrictive, or even just the first trail listed are all options. My first instinct is to return "N" if any of the associated trails has an "N" in that field, but we could definitely consider other methods.
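The three options could be sketched as a small resolver like the one below. This is only an illustration: `resolve_use` is a hypothetical helper, and it assumes use values are stored as "Y" or "N" strings (only "N" is confirmed in this thread), with nil meaning no data.

```ruby
# Pick one value for a use field from the values on every trail the
# segment belongs to. Assumes "Y"/"N" strings; nil means no data.
def resolve_use(values, strategy: :most_restrictive)
  known = values.compact
  return nil if known.empty?

  case strategy
  when :most_restrictive  then known.include?("N") ? "N" : known.first
  when :least_restrictive then known.include?("Y") ? "Y" : known.first
  when :first             then known.first
  else raise ArgumentError, "unknown strategy: #{strategy}"
  end
end
```

With the default strategy, a single "N" on any associated trail wins, which is the first-instinct behavior described above. Passing `strategy: :least_restrictive` or `strategy: :first` gives the other two options.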

smathermather commented 11 years ago

Mmm, good point on the catch. One to many... Errors of both inclusion and exclusion are problematic. Any tangible examples?

smathermather commented 11 years ago

I suspect this is a lossy data problem in the end: there's no way to reconstruct the segment-level data in any meaningful way.

Tangible example: the bridle trail between Bedford and Brecksville Reservations shares portions of the Towpath, and diverges from the Towpath where possible, weaving in and out, e.g. [image: cvnp_bridle_trail and mix of trails]

(more to be seen here: http://maps.clevelandmetroparks.com/url/cNc)

This is also Buckeye Trail. If we go with the most restrictive, which is Buckeye Trail, then we exclude bikes and bridle use. If we go with the most permissive... meh. I can't think this one through at this hour.

Thoughts?

danavery commented 11 years ago

Yeah, that's not an ideal case, for sure.

The problem isn't as much with the code (in its current or future forms) as it is with the data we've been working with. Since we've just now mostly locked down the use/access fields at both the trail and segment levels, and we're not using the per-segment use/access fields in Trailsy at the moment, we've only asked for and received per-trail attributes.

We can (and will) store and serve up whatever segment-level data we receive, but last I was aware, it's not really available on the segment level from everyone involved. Also, given that we've been asking for a lot of reformatted data in a fairly short period of time, it may not be practical to require segment-level data in the next couple of weeks, assuming it even exists everywhere. That's not to say it couldn't be made available and imported into the app sometime soon. If high-accuracy segment-level access/use data for every segment is critical for a Brigade app, that might be a great place to start working directly with the Summit County partners once the dust has settled.

But at first glance it would seem that any app should probably have some fallback behavior for agencies that may be providing multiple segments for a trail, but may not necessarily be willing or able to vouch for the accessibility/use of every segment individually. For example, it's still unclear how detailed some of the Summit County townships' data is going to be if and when they start submitting data.

Let me know if that makes any sense or if you think I'm off-base here.