Open hawkw opened 9 years ago
I'm thinking maybe we'd use a custom date format along the lines of
case class Date(day: Option[Int], month: Option[Int], year: Int)
or similar. We could then sort all the books by year, and then sub-sort each year by month if available, putting all the books for which month is not defined at the end of the year, and then sub-sort each month by day in a similar way.
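A minimal sketch of that ordering, assuming we rename the class to `PubDate` (a hypothetical name, chosen only to avoid clashing with `java.util.Date`) and treat a missing month or day as "sorts last within its parent":

```scala
// Hypothetical internal date type; same fields as the case class proposed above.
final case class PubDate(day: Option[Int], month: Option[Int], year: Int)

object PubDate {
  // A missing component maps to Int.MaxValue so it sorts after every defined value.
  private def key(part: Option[Int]): Int = part.getOrElse(Int.MaxValue)

  // Order by year, then month (undefined months last), then day (undefined days last).
  implicit val ordering: Ordering[PubDate] =
    Ordering.by[PubDate, (Int, Int, Int)](d => (d.year, key(d.month), key(d.day)))
}
```

Because the `Ordering` lives in the companion object, calling `.sorted` on a `List[PubDate]` (or `books.sortBy(_.date)`, if `Book` had such a field) should pick it up without extra imports.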
I'm not sure how many books in OpenLibrary even have days of publication, so maybe we should only care about month and year.
This date format would only be used internally – dates would be converted back to Strings by a custom serialiser like the ones we use for Book and Author.
Wait, there's really no existing date or datetime we can use? This sounds like a PHP "feature" where dates are just ints.
There's java.util.Date, but it's just generally a huge pain to use (probably the single worst component of the Java standard library), and convincing it to parse dates at varying levels of precision is a struggle – it's not very well suited for dealing with publication dates like the ones we get from OpenLibrary that normally tend to just have a month and a year.
A lot of people like the JodaTime library, and there's a nice Scala wrapper for it. However, I feel like the publication dates are so simple that we don't really need a fully featured datetime library to handle them, and I'm not psyched about adding another external dependency to our already pretty big list of libs we depend on.
With that said, I'm considering moving to JodaTime for timestamping loans and stuff, so maybe I'll be including it anyway. If that's the case, I'll probably just use it here too.
I've just been informed that if we restrict ourselves to Java 8+ JREs (which I believe we do anyway), there's a new java.time package in the standard lib that's basically meant to replace java.util.Date. Might want to use that; I'll have to look into it.
It looks like all the extant libs (including java.time) are really rubbish at handling cascading precision. I still feel like the roll-it-ourselves solution might be the best.
There's also the option of saying "we don't care about months and will just truncate all dates to years for sorting purposes" but I feel like that'd be sad.
Made a feature branch for this. Still not sure what the Right Thing to do here is.
For the Borges "list of books can be sorted" feature, we're going to want to be able to sort books by date of publication.
The problem is that the dates we get from OpenLibrary are not date objects/timestamps, but strings. Most of the strings are not accurate enough to be easily parsed to java.util.Dates. Also, the strings come in varying precisions – some of them are of the format Month, Year, while others are just Year. Therefore, I need to figure out some method of converting dates to a more machinable format upon ingestion.
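A rough sketch of what that ingestion step could look like for the two shapes mentioned above. Everything here is an assumption rather than settled design: the `PubDateParser` and `PubDate` names are hypothetical, month names are assumed to be spelled out in English, and any string that matches neither shape is simply rejected.

```scala
object PubDateParser {
  // Hypothetical internal representation; day is kept optional for completeness.
  final case class PubDate(day: Option[Int], month: Option[Int], year: Int)

  // Assumption: months arrive as full English names, in any letter case.
  private val months = Vector(
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december"
  )

  // "1944"
  private val YearOnly = """(\d{4})""".r
  // "June, 1944" (the comma is treated as optional)
  private val MonthYear = """([A-Za-z]+),?\s+(\d{4})""".r

  private def monthNumber(name: String): Option[Int] = {
    val i = months.indexOf(name.toLowerCase)
    if (i >= 0) Some(i + 1) else None
  }

  // Returns None for anything that is not a bare year or a recognised month plus year.
  def parse(raw: String): Option[PubDate] = raw.trim match {
    case YearOnly(y)     => Some(PubDate(None, None, y.toInt))
    case MonthYear(m, y) => monthNumber(m).map(n => PubDate(None, Some(n), y.toInt))
    case _               => None
  }
}
```

Returning an `Option` leaves it to the caller to decide what to do with unparseable strings, for example keeping the raw string for display and sorting those books last.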