Open cambro opened 4 years ago
Bumping this one. It would be incredibly useful for COVID-19 work. The newest publications need to come first!
Bumping this one again!
I'll add these fields to the Elasticsearch schema and expose them via the API.
The main change needed for this is that the calls to the xdd API need to happen at ingestion time, so we can store the returned metadata in the parquet objects.
I'll probably add the metadata to the PDFs parquet, which can then be loaded into Elasticsearch accordingly.
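To make the ingestion-time idea concrete, here is a rough sketch of the enrichment step. Everything named here is an assumption for illustration: the `pdf_name` key, the `pub_date` / `journal` / `publisher` field names, and the `fetch_metadata` callable, which stands in for whatever the real xdd API call ends up being. It only shows the shape of the merge, not the actual pipeline.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical metadata columns to add to each PDF row.
METADATA_FIELDS = ("pub_date", "journal", "publisher")


def enrich_records(
    records: Iterable[Record],
    fetch_metadata: Callable[[str], Optional[Record]],
) -> List[Record]:
    """Attach publication metadata to each PDF record at ingestion time.

    `fetch_metadata` is a placeholder for the metadata lookup; it takes a
    PDF name and returns a dict, or None when nothing is found.
    """
    enriched = []
    for record in records:
        meta = fetch_metadata(record["pdf_name"]) or {}
        row = dict(record)  # copy so the input rows are not mutated
        for field in METADATA_FIELDS:
            # Missing values are stored as None so every row has the same columns.
            row[field] = meta.get(field)
        enriched.append(row)
    return enriched


# Stub lookup used only for this example; not a real API response.
def fake_lookup(pdf_name: str) -> Optional[Record]:
    known = {
        "a.pdf": {"pub_date": "2020-04-01", "journal": "Journal A", "publisher": "Pub X"},
    }
    return known.get(pdf_name)


rows = enrich_records([{"pdf_name": "a.pdf"}, {"pdf_name": "b.pdf"}], fake_lookup)
```

The resulting rows all carry the same columns, so they could be written out alongside the existing PDF data and picked up by the Elasticsearch load without a second API round trip.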
It would be very useful to be able to sort the response by other metadata. Currently it is returned in an order that combines "confidence" and query matching. The big one would be publication date: it will be common for scientists to want to see the latest results first. Secondary filtering would be by journal. Publisher filtering would mostly be a convenience for communication with publishers, though it could be useful for science too.
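For discussion, a sketch of what the request could look like on the search side once those fields are indexed. It builds a standard Elasticsearch query DSL body (a `bool` query with `filter` clauses plus a `sort` list). The field names `content`, `pub_date`, `journal`, and `publisher`, and the function itself, are hypothetical, and the `term` filters assume the journal and publisher fields are mapped as keywords.

```python
from typing import Any, Dict, List, Optional


def build_search_body(
    query: str,
    newest_first: bool = False,
    journal: Optional[str] = None,
    publisher: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a search request body with optional date sort and filters."""
    filters: List[Dict[str, Any]] = []
    if journal:
        filters.append({"term": {"journal": journal}})
    if publisher:
        filters.append({"term": {"publisher": publisher}})

    body: Dict[str, Any] = {
        "query": {
            "bool": {
                "must": [{"match": {"content": query}}],
                "filter": filters,  # filters narrow results without affecting the score
            }
        }
    }
    if newest_first:
        # Publication date first, relevance score as the tie-breaker.
        body["sort"] = [{"pub_date": {"order": "desc"}}, "_score"]
    return body


body = build_search_body("ace2 receptor", newest_first=True, journal="Journal A")
```

Leaving `newest_first` off keeps the current relevance-based ordering as the default, so existing API consumers would not see a change unless they opt in.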