codeforamerica / ohana-api

The open source API directory of community social services.
http://ohana-api-demo.herokuapp.com/api
BSD 3-Clause "New" or "Revised" License
185 stars, 344 forks

Short Descriptions Longer than 200 Characters #137

Closed: sunnyrjuneja closed this issue 10 years ago

sunnyrjuneja commented 10 years ago

Hi Folks,

About 10% of the records we scraped have a short description greater than 200 characters. What do you think is a sensible default on how we should handle these records? The distribution of the length of short description is as follows:

Max: 375
200 to 250: 117
251 to 300: 87
301 to 351: 7
351 to 400: 3
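For anyone who wants to reproduce a breakdown like this on their own dataset, here is a minimal sketch. The bucket edges, the function names, and the sample data are assumptions for illustration; the buckets are made non-overlapping, so they differ slightly from the ranges quoted above.

```python
from collections import Counter

# Hypothetical, non-overlapping bucket edges (inclusive bounds).
BUCKETS = [(0, 200), (201, 250), (251, 300), (301, 350), (351, 400)]


def bucket_label(length):
    """Return the label of the bucket that a description length falls into."""
    for low, high in BUCKETS:
        if low <= length <= high:
            return f"{low} to {high}"
    return f"over {BUCKETS[-1][1]}"


def length_distribution(descriptions):
    """Count how many descriptions fall into each length bucket."""
    return Counter(bucket_label(len(text)) for text in descriptions)


if __name__ == "__main__":
    # Made-up sample data, not the scraped records.
    sample = ["a" * 120, "b" * 210, "c" * 260, "d" * 375]
    for label, count in length_distribution(sample).items():
        print(label, count)
```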

Is truncating the right way to go? Should we just remove the constraint? For more context, check out https://github.com/openoakland/ohana-api/issues/3.

siruguri commented 10 years ago

@whatasunnyday - does the AC data set have both a short and a long description?

If so, I think it'll be good to reflect those semantics in the data structure, and not truncate the short description further into a shorter description which is a semantically lossy strategy. I would argue that the data structure should be aligned towards user needs rather than design constraints; how "too long" short descriptions are appropriately visualized should be the domain of the client app.
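To illustrate what handling this in the client could look like, here is a rough sketch that shortens a description for display only and leaves the stored value untouched. The function name, the 200-character default, and the ellipsis are assumptions, not anything the API or any existing client defines.

```python
def display_excerpt(text, limit=200, ellipsis="..."):
    """Shorten text for display only, cutting at a word boundary."""
    if len(text) <= limit:
        return text
    # Leave room for the ellipsis inside the limit.
    cut = text[: limit - len(ellipsis)]
    # Back up to the last space so a word is not split in half.
    head, sep, _ = cut.rpartition(" ")
    if sep:
        cut = head
    return cut.rstrip(" ,;:") + ellipsis


if __name__ == "__main__":
    long_text = "Provides emergency food and shelter referrals. " * 10
    print(display_excerpt(long_text, limit=80))
```

The cut is still lossy, but the loss happens only at render time, so a different client is free to show the full text.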

sunnyrjuneja commented 10 years ago

It does.

anselmbradford commented 10 years ago

I'd be open to expanding the short description to 255 characters. That's a common varchar db field length maximum. What say you, @monfresh @spara?

monfresh commented 10 years ago

The character limit was not set for any technical reasons. It's for editorial purposes. We'd like to encourage service providers to provide a succinct overview of the service(s) they provide. If the "short description" keeps going on, it no longer remains short. That's why we also have a separate field for the full-length description. We just have to decide what a reasonable limit is for what's meant to be a summary of the services.

If a dataset, like Alameda's, contains some entries with short descriptions that are longer than the limit we agree on, then they should actually be re-written, not just truncated.

---

For context, the text above the separator is 181 characters long and already feels like more than a summary, so I think going beyond 200 characters is pushing it. Thoughts?

monfresh commented 10 years ago

Scratch that last comment about length. I miscounted! It's only the second paragraph that has 181 characters. The first paragraph has 439 characters, and the first 4 sentences have 263 characters, so somewhere in between sounds good to me.

migurski commented 10 years ago

I’d leave character limits out of the application myself. Feels weird to attempt to impose an aesthetic judgement here.

siruguri commented 10 years ago

+1 for comment by @migurski ... the salient attribute appears to be the long description. The short description is a client decision and baking it into the data structure doesn't seem to me to add value to the API.

monfresh commented 10 years ago

That actually makes sense. I'll open an issue in OpenReferral as well, where short description is required.

anselmbradford commented 10 years ago

Short description being optional in the spec seems fine from my perspective. I do think it's useful data to have, though, and I'd like to see guidance in the spec, in ohana-web-search, or elsewhere on how a client would include or extrapolate a summary of services of some sort, as it's important to the usability of the data.

For instance, take the entry below from smc-connect's search results that has a short description on the last line. Without the summary of services, it would be hard to tell what the organization actually does:

[Screenshot: 2014-03-14 at 1:35:56 pm]

Now that's just one client example, but it also illustrates a bare minimum of data being pulled from the API to summarize an organization. A full description would not work in place of a short description here (e.g. a truncated full description that grammatically doesn't make sense might as well not be there). So though it's the client's choice how the summary is included and displayed, it's a problem if it's difficult to create a service summary or something analogous in a predictable size from the API. Bottom line: make it optional, but document at some layer a recommended method of pulling a service summary out of the API.
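As one possible reading of "a recommended method of pulling a service summary", here is a hedged sketch that takes whole leading sentences from a full description until a size budget is reached, so the result never ends mid-sentence. The function name, the 200-character budget, and the naive sentence splitter are all assumptions. Abbreviations such as "Dr." would trip the splitter, and the approach cannot match a summary written by a person.

```python
import re


def sentence_summary(full_description, max_chars=200):
    """Return the leading whole sentences that fit within max_chars."""
    # Naive split: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", full_description.strip())
    summary = ""
    for sentence in sentences:
        candidate = f"{summary} {sentence}".strip()
        if len(candidate) > max_chars:
            break
        summary = candidate
    # Empty when even the first sentence exceeds the budget.
    return summary


if __name__ == "__main__":
    text = (
        "We provide food. We also offer housing help to families. "
        "Call us for more details about eligibility."
    )
    print(sentence_summary(text, max_chars=60))
```

Returning an empty string when nothing fits leaves the fallback (no summary, keywords, or something else) up to the client.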

monfresh commented 10 years ago

Keep in mind that the reason why SMC-Connect has helpful summaries that can be displayed on the front end is not thanks to any special API feature, but because those descriptions were written by humans. For the most part, they are one to two sentences long, with the vast majority being under 200 characters. This length fits nicely with the front end as currently designed.

However, note that even if you bump up the short description to 300 characters, it still looks decent:

Search results: [Screenshot: 2014-03-14 at 3:58:55 pm]

Details: [Screenshot: 2014-03-14 at 3:57:43 pm]

We can't assume that other cities have the same level of data quality, so the appropriate recommendation to make is to write helpful and succinct descriptions if the client wishes to display them using the default design, or to modify the front end to suit their needs.

anselmbradford commented 10 years ago

Yeah, makes sense. In a certain light, the character limit was a special feature of the API that, while undesirable as pointed out in this thread, did force a summarizing of the data that's entered, which is the whole purpose of the field. This can still be imposed by the admin tool, but not the API. My point, though, is that having a summary of services in a bounded data size for an organization is a valuable data point to be able to retrieve from the API, so it should be retained in some form. Perhaps that doesn't look like a short description (which really does have import and syncing issues with the longer description), but it could be the service category keywords (possibly, though these have their own issues) or a more formulaic summary.
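A rough sketch of the "admin tool, but not the API" idea: the editing layer flags an over-long short description as a warning, while nothing is rejected. The names `ValidationResult`, `check_short_description`, and `RECOMMENDED_MAX` are hypothetical and are not part of ohana-api or its admin interface.

```python
from dataclasses import dataclass, field

RECOMMENDED_MAX = 200  # assumed editorial guideline, not an API constraint


@dataclass
class ValidationResult:
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    @property
    def ok(self):
        """A record is saveable as long as there are no hard errors."""
        return not self.errors


def check_short_description(text, recommended_max=RECOMMENDED_MAX):
    """Warn about, but never reject, an over-long short description."""
    result = ValidationResult()
    if text is None or not text.strip():
        return result  # optional field: nothing to check
    if len(text) > recommended_max:
        result.warnings.append(
            f"Short description is {len(text)} characters; "
            f"consider rewriting it to {recommended_max} or fewer."
        )
    return result


if __name__ == "__main__":
    outcome = check_short_description("x" * 375)
    print(outcome.ok, outcome.warnings)
```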

(If someone wanted to get really crazy with it, there's http://en.wikipedia.org/wiki/Automatic_summarization)