Issue #16 was actually two bugs in a trenchcoat. First, the simplify datetime function wasn't handling a null-like value (specifically, {'date-parts': [[None]]}), and then it choked because the journal didn't have a 'published' field. I imagine we'll need to keep doing a bunch of massaging to handle incomplete metadata. I'm a little hesitant to just mark all the timestamp columns as "optional" because we do need one of them; ideally we'll write a routine that picks whichever one is available and stores it in a required timestamp column.
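To make the "pick whichever timestamp is available" idea concrete, here is a rough sketch rather than the code in this PR. The function names `simplify_date_parts` and `first_timestamp` are hypothetical, and the fallback field names other than 'published' are assumptions about what the metadata might carry.

```python
from datetime import datetime
from typing import Optional

# Assumed fallback order; only 'published' is named in this PR.
TIMESTAMP_FIELDS = ("published", "published-online", "published-print", "created")


def simplify_date_parts(value: Optional[dict]) -> Optional[datetime]:
    """Convert {'date-parts': [[2023, 5, 1]]} to a datetime, or None if unusable."""
    if not value:
        return None
    parts = value.get("date-parts") or []
    first = parts[0] if parts else []
    # Keep leading non-null entries only, so [[None]] ends up empty.
    clean = []
    for part in first or []:
        if part is None:
            break
        clean.append(int(part))
    if not clean:
        return None
    # Pad a missing month and/or day with 1.
    year, month, day = (clean + [1, 1])[:3]
    return datetime(year, month, day)


def first_timestamp(item: dict) -> Optional[datetime]:
    """Return the first usable timestamp among the candidate fields."""
    for field in TIMESTAMP_FIELDS:
        timestamp = simplify_date_parts(item.get(field))
        if timestamp is not None:
            return timestamp
    return None
```

With something like this, the individual timestamp columns could stay optional while the value returned by `first_timestamp` feeds the single required column, and a `None` result would flag an item with no usable date at all.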
Anyway:
- two commits with correctly failing tests to catch both of those
- add a tuple of types that we'll consider "papers"
- filter out things that aren't in that tuple when we clean results pages. Currently I'm doing this silently, but we could also log it if we want. The query is specifically for paper-like things, so dropping everything else should be expected.
- add tests for these errors
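For anyone skimming, the filtering step amounts to something like the sketch below. This is an illustration, not the actual diff: `PAPER_TYPES`, its contents, `clean_results`, and the 'type' key are all assumed names, and the debug log line is the optional logging mentioned above.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical set of types treated as "papers"; the real tuple may differ.
PAPER_TYPES = ("journal-article", "posted-content", "proceedings-article")


def clean_results(items: list) -> list:
    """Keep only items whose 'type' is in PAPER_TYPES."""
    kept = []
    for item in items:
        if item.get("type") in PAPER_TYPES:
            kept.append(item)
        else:
            # Quiet by default; only visible when debug logging is enabled.
            logger.debug("dropping non-paper item of type %r", item.get("type"))
    return kept
```

Logging at debug level keeps the default behaviour silent while still leaving a trail if we ever need to check what got dropped.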
i'll leave this open until tmrw if people have feelings about this and then merge :)
Fix: https://github.com/sneakers-the-rat/journal-rss/issues/16