Closed richard-jones closed 7 years ago
At the moment it probably depends only on what Wellcome expected. It can be changed, though.
On 17 Aug 2016 11:25, "Richard Jones" notifications@github.com wrote:
Do we have any certainty over date formats, and can we attempt to normalise
dateOfPublication: "2014 Sep"
It would be useful to have this either as a UTC datestamp, or broken down into year, month, day, etc, so that the dates are easy to re-use in other systems.
Presumably the date representation in the API, and how it is presented in the CSV are separated, though, so we can do both?
Yes we can. I just mean that so far the only thing that will have influenced what we currently do is how Wellcome wanted it to look. If that was the same as how it came from the remote API, then it will simply not have been altered.
dateofpublication now standardises to a UTC date where possible. See example:
https://dev.api.cottagelabs.com/service/lantern/Arra8ccNed986NWp5/results
Note that this one only had a date of "2006 ", so the rest is assumed to be the start of the year.
This is on dev, ready for live after confirmation.
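For illustration, a minimal sketch of how this kind of normalisation could work, assuming the raw values look like "2014 Sep" or "2006 ". The function name and the list of input layouts are hypothetical, not taken from the Lantern code:

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical input layouts, most specific first. The real upstream
# sources may use other layouts that are not covered here.
KNOWN_FORMATS = ("%Y %b %d", "%Y %b", "%Y")


def normalise_publication_date(raw: str) -> Optional[datetime]:
    """Return a UTC datetime, or None if the value is not understood."""
    text = raw.strip()  # tolerate trailing spaces such as "2006 "
    for fmt in KNOWN_FORMATS:
        try:
            # strptime fills a missing month or day with 1, which gives
            # the start of the year or month.
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        return parsed.replace(tzinfo=timezone.utc)
    return None


print(normalise_publication_date("2006 "))
print(normalise_publication_date("2014 Sep"))
```

Returning `None` for anything unrecognised matches the "where possible" caveat: the original value can then be left untouched rather than guessed at.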
Ok, can we have this in an ISO format please:
Sun, 01 Jan 2006 00:00:00 GMT
Should be
2006-01-01T00:00:00Z
Richard Jones, Founder, Cottage Labs https://cottagelabs.com || @cottagelabs
Lantern: https://lantern.cottagelabs.com Repository Solutions: https://cottagelabs.com/repository
Oh, I thought you asked for UTC. Yeah can change it.
Yes, UTC is the timezone, that's what the Z at the end means. The format of the timestamp is ISO 8601, which represents UTC times in that way.
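As a rough sketch of the difference, here is the same instant converted from the wordy form to the ISO 8601 form using only the Python standard library. The helper name `rfc_to_iso8601` is made up for this example, and Python is used purely for illustration:

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def rfc_to_iso8601(stamp: str) -> str:
    """Convert a string like 'Sun, 01 Jan 2006 00:00:00 GMT' to ISO 8601 in UTC."""
    parsed = parsedate_to_datetime(stamp)
    if parsed.tzinfo is None:
        # No usable offset in the input; assume it was already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # The trailing "Z" is the ISO 8601 designator for UTC.
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(rfc_to_iso8601("Sun, 01 Jan 2006 00:00:00 GMT"))  # 2006-01-01T00:00:00Z
```

Both strings describe the same moment in the same timezone; only the serialisation differs.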
Yeah, but the default UTC string is the wordy one, isn't it? Not that it matters, just interest.
Ah, I don't know, I guess that depends on the library you ask to format it. But yes, if we can keep the stamp in UTC and express it as a standard ISO format, that would be good.
Cheers,
Richard
I notice too that some records have an electronicPublicationDate field, but this is not in the journal portion of the result object. Can we normalise them?
It is that way on purpose, because in the EPMC data it is not in the journal object. It therefore seems that dateofpublication is a feature of the journal itself, but electronicpublicationdate may not be: it could presumably be published electronically somewhere other than the journal site. It could be moved into our journal portion of the result object, but could this mislead people as to what it is? It is possible that it IS the journal publication date, but then it seems odd that EPMC put it outside that area themselves, given that the other date we grab is inside that area.
@richard-jones see my previous comment on this issue, do you want that date moved into the journal part of the result or not, given that it may not actually be a date related to the journal?
For now I have formatted the dates but not moved the electronic date. Moving to live soon, will update again.
On live. @richard-jones, notifying you in case you want to check the docs.
In terms of structuring the output data, I'd have thought it made more sense that the dates not be in the journal section. I know the publication date of the article could be the same as the publication date of an issue of a journal (for journals that still work that way), but since the results are focussed around an article, it makes sense to attach as much detail to the article "node" as possible.
I wonder if it might be worth us overall reviewing the output data model for use as an API - at the moment it's very much a dump of the data you would use to build the UI/CSVs, but it's fairly difficult to interpret from a developer's point of view, and some compliance results have to be implied from the data. This would also, of course, be helped by documentation, and when we get a bit more breathing space I'll pin you down and quiz you about it!
It may make more sense for dates not to be in the journal section, but then again we would be making an assumption about the data we are receiving, and changing how it may appear to a consumer of our API. Right now they are where they are in our API because of where they are found in the remote services we use. It may make more sense for a publication date to be about an article rather than a journal, but if we find that date in the journal section of the data from EPMC, then that is what that date is about (or EPMC put it in the wrong place).
It probably is worth a review, and as previously mentioned, there has been no actual API requirement yet beyond building UIs and CSVs, so no real user requirement to do anything in a particularly different way. In terms of implying compliance results from the data: yes, I think that will always be the case, because compliance results ARE only implications, and depend on the criteria of the end user. Wellcome have such criteria, so their CSV output determines compliance results by their criteria.
Whilst pinning me down and quizzing me may be of some help, it would be better to find a user and quiz them about it. That is what we have been doing with Cecy to date, and the opinion of a user we are building the service for is more useful than our assumption of how we think it should be.
The simplest solution to where things should go would be to just make it all flat. But then there are things like journal title which would clash with article title, unless we call it something like journal_title, which is just implying hierarchy in a different way. So back to the publication date question, we would have the same problem - WHAT sort of publication date is it that we are actually presenting? Given that the data does come from external services, are we really better to abstract it even further away from that representation, or to leave it close to as is, bar changes required for merging information from multiple services?
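To make the naming clash concrete, a small sketch of flattening a nested result so that nested keys pick up their parent's name as a prefix. The record below is invented for illustration and is not the actual Lantern result model:

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts, joining parent and child keys with '_'."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Made-up example record; field values are placeholders.
record = {
    "title": "An article",
    "journal": {"title": "A journal", "dateOfPublication": "2014 Sep"},
    "electronicPublicationDate": "2014-08-20",
}

print(flatten(record))
```

The flat keys come out as `title`, `journal_title`, `journal_dateOfPublication` and `electronicPublicationDate`, so the hierarchy is still there, only encoded in the key names.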
If we are going to make firm statements about compliance in our results, rather than having it be an implication, then we need a definition of compliance. Making our own definition may not tie up with the definitions in use by one/multiple users, so unless there is a clear reason for creating such a definition, what would be the benefit of arbitrarily creating one?
Monitor is long finished - can this be closed @richard-jones ?
I think we still want the standardised date representations for our own results, but yes, this is no longer relevant to Monitor.
Whenever we normalise a date from an external source, record the fact that we changed it and how we changed it, in our provenance info.
Use the format 2014-01-01
Where only year and month are known, then only do 2014-01
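A possible shape for this, as a sketch only: the function name, the input layouts and the provenance fields are all assumptions, and year-only input is left unchanged because the notes above do not say how to treat it.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple

# (input layout, output layout) pairs, most specific first. Hypothetical.
LAYOUTS = (("%Y %b %d", "%Y-%m-%d"), ("%Y %b", "%Y-%m"))


def normalise_with_provenance(
    field: str, raw: str
) -> Tuple[str, Optional[Dict[str, str]]]:
    """Return (value, provenance); provenance is None if nothing changed."""
    text = raw.strip()
    for in_fmt, out_fmt in LAYOUTS:
        try:
            value = datetime.strptime(text, in_fmt).strftime(out_fmt)
        except ValueError:
            continue
        if value == raw:
            return raw, None
        provenance = {
            "field": field,
            "original": raw,
            "normalised": value,
            "change": f"reformatted from '{in_fmt}' to '{out_fmt}'",
        }
        return value, provenance
    # Unrecognised input is passed through untouched, with no provenance entry.
    return raw, None


print(normalise_with_provenance("dateOfPublication", "2014 Sep"))
```

Keeping the original string in the provenance entry means a consumer can always see what the external source actually said.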
Data has been flattened, and dates in the JSON are ISO dates. They all normalise down to the earliest month and earliest day if one is not provided, so something claimed to be published in 2011 will show as 2011-01-01. The UI displays this as 01/01/2011.
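A small sketch of that behaviour, assuming the partial values arrive as "2011" or "2011-06". The helper names are hypothetical and this is not the production code:

```python
from datetime import date


def pad_to_earliest(partial: str) -> str:
    """Fill a missing month and/or day with '01'."""
    parts = partial.split("-")
    parts += ["01"] * (3 - len(parts))
    return "-".join(parts)


def display_date(iso_date: str) -> str:
    """Render an ISO date (YYYY-MM-DD) as DD/MM/YYYY for the UI."""
    return date.fromisoformat(iso_date).strftime("%d/%m/%Y")


iso = pad_to_earliest("2011")
print(iso)                # 2011-01-01
print(display_date(iso))  # 01/01/2011
```

Note that padding makes "published in 2011" indistinguishable from "published on 1 January 2011" unless the original precision is recorded somewhere else.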