alexandrialibrary / Alexandria

Alexandria is a simple little card catalogue webapp with a terribly pretentious name.
MIT License

Make dates of publication machine-readable #48

Open hawkw opened 9 years ago

hawkw commented 9 years ago

For the Borges "list of books can be sorted" feature, we're going to want to be able to sort books by date of publication.

The problem is that the dates we get from OpenLibrary are not date objects/timestamps, but strings. Most of the strings are not precise enough to be easily parsed into java.util.Date objects. Also, the strings come in varying precisions – some of them are of the format Month, Year, while others are just Year.

Therefore, I need to figure out some method of converting dates to a more machine-readable format upon ingestion.

hawkw commented 9 years ago

I'm thinking maybe we'd use a custom date format along the lines of

case class Date(day: Option[Int], month: Option[Int], year: Int)

or similar. We could then sort all the books by year, and then sub-sort each year by month if available, putting all the books for which month is not defined at the end of the year, and then sub-sort each month by day in a similar way.
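
A minimal sketch of that sort order, assuming the case class exactly as written above. The helper name `lastIfMissing` and the sample values are made up for illustration; a missing month or day is mapped to `Int.MaxValue` so that it lands at the end of its year or month.

```scala
case class Date(day: Option[Int], month: Option[Int], year: Int)

object Date {
  // Hypothetical helper: a missing component sorts after every real value.
  private def lastIfMissing(part: Option[Int]): Int = part.getOrElse(Int.MaxValue)

  // Compare by year, then month, then day, with undefined parts last.
  implicit val ordering: Ordering[Date] =
    Ordering.by((d: Date) => (d.year, lastIfMissing(d.month), lastIfMissing(d.day)))
}

// Made-up sample data covering all three precisions.
val dates = List(
  Date(None, None, 1962),
  Date(None, Some(6), 1962),
  Date(Some(4), Some(12), 1944),
  Date(Some(1), Some(6), 1962),
  Date(None, Some(3), 1962)
)

val sorted = dates.sorted
```

Putting the `Ordering` in the companion object means `sorted` picks it up without an import. Sorting books would then be something like `books.sortBy(_.published)`, where `published` is a hypothetical field name.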

I'm not sure how many books in OpenLibrary even have days of publication, so maybe we should only care about month and year.

This date format would only be used internally – the values would be converted back to Strings by a custom serialiser like the ones we use for Book and Author.
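
A rough sketch of the string conversion in both directions, assuming only the two input shapes mentioned earlier in this thread ("Month, Year" and "Year") with English month names. `DateStrings`, `parse`, and `render` are hypothetical names, and this is not wired into the project's actual serialisers.

```scala
case class Date(day: Option[Int], month: Option[Int], year: Int)

object DateStrings {
  private val monthNames = Vector(
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December")

  // Regex extractors must match the whole input string.
  private val YearOnly  = """(\d{4})""".r
  private val MonthYear = """([A-Za-z]+),?\s+(\d{4})""".r

  /** Parse "Year" or "Month, Year"; anything else yields None. */
  def parse(raw: String): Option[Date] = raw.trim match {
    case YearOnly(y) => Some(Date(None, None, y.toInt))
    case MonthYear(m, y) =>
      val idx = monthNames.indexWhere(_.equalsIgnoreCase(m))
      if (idx >= 0) Some(Date(None, Some(idx + 1), y.toInt)) else None
    case _ => None
  }

  /** Render back to the same two shapes; the day is ignored in this sketch. */
  def render(date: Date): String = date match {
    case Date(_, Some(m), y) if m >= 1 && m <= 12 => s"${monthNames(m - 1)}, $y"
    case Date(_, _, y) => y.toString
  }
}
```

Returning `Option` leaves it to the caller to decide what to do with strings that fit neither shape, for example keeping the raw string and sorting those books last.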

redbassett commented 9 years ago

Wait, there's really no existing date or datetime we can use? This sounds like a PHP "feature" where dates are just ints.

hawkw commented 9 years ago

There's java.util.Date, but it's generally a huge pain to use (probably the single worst component of the Java standard library), and convincing it to parse dates at varying levels of precision is a struggle. It's not well suited to publication dates like the ones we get from OpenLibrary, which tend to have just a month and a year.

A lot of people like the Joda-Time library, and there's a nice Scala wrapper for it. However, I feel like the publication dates are so simple that we don't really need a fully featured datetime library to handle them, and I'm not psyched about adding another external dependency to our already pretty big list.

With that said, I'm considering moving to Joda-Time for timestamping loans and stuff, so maybe I'll be including it anyway. If that's the case, I'll probably just use it here too.

hawkw commented 9 years ago

I've just been informed that if we restrict ourselves to Java 8+ JREs (which I believe we do anyway), there's a new java.time package in the standard lib that's basically meant to replace java.util.Date. Might want to use that; I'll have to look into it.

hawkw commented 9 years ago

It looks like all the extant libs (including java.time) are really rubbish at handling cascading precision. I still feel like the roll-it-ourselves solution might be the best.
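
For reference, a sketch of what falling back through precisions might look like with java.time. `Year` and `YearMonth` are real java.time classes, but each precision is its own type, so the `Either` wrapper and the name `parsePublished` are assumptions made here to hold both. The input shape "Month, Year" with English month names is also assumed.

```scala
import java.time.{Year, YearMonth}
import java.time.format.DateTimeFormatter
import java.util.Locale
import scala.util.Try

// Assumed input shape "Month, Year", e.g. "June, 1962".
val monthYear = DateTimeFormatter.ofPattern("MMMM, uuuu", Locale.ENGLISH)

// Try the more precise type first, then fall back to a bare year.
def parsePublished(raw: String): Option[Either[Year, YearMonth]] = {
  val text = raw.trim
  val asMonth: Option[Either[Year, YearMonth]] =
    Try(YearMonth.parse(text, monthYear)).toOption.map(ym => Right(ym))
  val asYear: Option[Either[Year, YearMonth]] =
    Try(Year.parse(text)).toOption.map(y => Left(y))
  asMonth.orElse(asYear)
}
```

`Year` and `YearMonth` are each comparable only with their own type, so sorting a mixed list would still need a hand-written rule for where a bare year falls relative to the months in that year.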

hawkw commented 9 years ago

There's also the option of saying "we don't care about months and will just truncate all dates to years for sorting purposes" but I feel like that'd be sad.

hawkw commented 9 years ago

Made a feature branch for this. Still not sure what the Right Thing to do here is.