alexandrialibrary / Alexandria

Alexandria is a simple little card catalogue webapp with a terribly pretentious name.

MIT License

Normalize titles? #47

Closed: hawkw closed this issue 9 years ago

hawkw commented 9 years ago

In addition to having the backend automatically perform corrections on bylines (#46), might we also want to do corrections on title strings? Titles are currently coming in from OpenLibrary with only the first letter of the first word capitalised, but we might want to capitalise the rest of the words as well. While we can do this on the front-end using class="text-capitalize" (thanks Bootstrap!), this doesn't handle words like "of" and "the" that shouldn't be up-cased when occurring in the middle of titles.

While we could handle this in the frontend using JavaScript, I think that in the long term it would probably be more efficient to do this on book ingestion on the backend, because that way we'd only ever have to correct the string once and then the corrected title would be in the DB permanently.
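A rough sketch of what correcting titles at ingestion might look like, in Python purely for illustration. The function name `normalize_title` and the `SMALL_WORDS` list are hypothetical (the comment above only names "of" and "the"), and the rules are a simplified English-only convention, not anything Alexandria or OpenLibrary actually implements.

```python
# Hypothetical list of words left lower-case mid-title; only "of" and "the"
# come from the discussion, the rest are assumed for illustration.
SMALL_WORDS = {"a", "an", "and", "in", "of", "on", "or", "the", "to"}


def normalize_title(title: str) -> str:
    """Capitalise each word, keeping small words lower-case mid-title."""
    words = title.split()
    last = len(words) - 1
    result = []
    for i, word in enumerate(words):
        lower = word.lower()
        if 0 < i < last and lower in SMALL_WORDS:
            # Small words stay lower-case unless they open or close the title.
            result.append(lower)
        else:
            # Upper-case only the first character so that any capitals
            # already present later in the word are preserved.
            result.append(word[:1].upper() + word[1:])
    return " ".join(result)


print(normalize_title("The lord of the rings"))
```

Running this once at ingestion and storing the result is what would make the correction a one-time cost, as opposed to re-casing the string on every page render.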

redbassett commented 9 years ago

We should figure out how to get correct capitalization from the source, not do it ourselves. An author might have a preferred stylization of a title, and we should use that.

hawkw commented 9 years ago

> We should figure out how to get correct capitalization from the source, not do it ourselves.

I agree that would be the Right Thing, but OpenLibrary doesn't appear to support this. All the books from OpenLibrary appear to be coming with just the first letter capitalised, and their API doesn't offer any options to control that.

hawkw commented 9 years ago

Obviously, as a localisation issue, this will be a humongous pile of suffering for us to do Correctly; my friend Catherine is a French major and she's always telling me about the deeply confusing French rules for title capitalisation.

hawkw commented 9 years ago

> OpenLibrary doesn't appear to support this

Never mind. I checked out the OpenLibrary API, and it looks like their API requests are returning the book's correct stylisation. That just seems strange to me, because I've definitely seen some English language titles from OpenLibrary that don't seem to follow what I'd consider to be the correct capitalisation rules.

I guess this is rather significantly less of a problem than I thought it was! I'll just have to make sure that title ordering in the backend is case-insensitive and this will be done.
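For the case-insensitive ordering, a minimal sketch in Python, assuming the titles are sorted in application code. The helper name `sort_titles` is hypothetical, and in practice the ordering might instead be done by the database query.

```python
def sort_titles(titles):
    """Return titles in alphabetical order, ignoring letter case."""
    # str.casefold is a more aggressive form of lower() intended for
    # caseless comparison of Unicode strings.
    return sorted(titles, key=str.casefold)


print(sort_titles(["the Hobbit", "Dune", "a Wizard of Earthsea"]))
```

With a plain `sorted(titles)`, every title starting with an upper-case letter would sort ahead of every title starting with a lower-case one, which is the problem this avoids.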

redbassett commented 9 years ago

Chance they have wrong titles for some books?

hawkw commented 9 years ago

> Chance they have wrong titles for some books?

It's possible. But it's also possible that I'm just wrong about this.