OpenTreeOfLife / opentree

Opentree browsing and curation web site. For overarching or cross-repo concerns, please see the 'germinator' repo.
http://tree.opentreeoflife.org/
BSD 2-Clause "Simplified" License

Add JSON file (alongside statistics) tracking OT versions by date #615

Closed jimallman closed 8 years ago

jimallman commented 9 years ago

This file (ot-versions-by-date.json) should be placed alongside the phylesystem.json and synthesis.json stats files in opentree/webapp/static/statistics/ on the API server. For now, it's very simple:

[
    {"version": "ott2.8", "date": "2014-06-11T20Z"},
    {"version": "ott2.7", "date": "2014-05-08T20Z"},
    {"version": "ott2.6", "date": "2014-04-11T20Z"},
    {"version": "ott2.5", "date": "2014-03-28T20Z"},
    {"version": "ott2.4", "date": "2014-03-17T20Z"},
    {"version": "ott2.3", "date": "2013-12-04T20Z"}
]

These dates are from the reference-taxonomy README (https://github.com/OpenTreeOfLife/reference-taxonomy/blob/master/doc/README.md) and apparently represent the date when each version was built. In our stats timeline (on the "progress" page), we'll want to highlight when a new taxonomy version is actually applied in synthesis. (We can tie these together using information provided elsewhere.)

jar398 commented 9 years ago

I wonder if this should go in the germinator repo instead? Somewhere near where Peter will be putting the same information for synthesis?

I think we should drop the T20 business. Earlier I thought this wouldn't matter since the extra hour of resolution might be helpful and the strings don't leak, but now I'm pretty sure they will leak and will confuse anyone who sees them. We could provide HH:MM:SS but I think that's kind of silly for statistics-gathering. The processing code should be robust (somehow) to multiple samples occurring in the same day.
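As a rough illustration of dropping the hour suffix, here is a minimal sketch that reduces a stamp such as "2014-06-11T20Z" to its calendar day. The helper name `to_day` and the regular expression are assumptions made for this example, not code from the stats scripts.

```python
import re

# Hypothetical pattern: a YYYY-MM-DD day, optionally followed by a time
# part such as "T20Z", "T20:15Z" or "T20:15:30Z".
_STAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})(?:T\d{2}(?::\d{2}){0,2}Z?)?$")


def to_day(stamp):
    """Return the YYYY-MM-DD part of a date stamp, discarding any time part."""
    match = _STAMP_RE.match(stamp)
    if match is None:
        raise ValueError("unrecognized date stamp: %r" % (stamp,))
    return match.group(1)


records = [
    {"version": "ott2.8", "date": "2014-06-11T20Z"},
    {"version": "ott2.7", "date": "2014-05-08"},
]
days = [to_day(record["date"]) for record in records]
```

Rejecting anything that does not look like a plain day or a day plus time keeps malformed stamps from passing silently into the statistics.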

jar398 commented 9 years ago

If you want more see http://files.opentreeoflife.org/ott/

jimallman commented 9 years ago

> I wonder if this should go in the germinator repo instead? Somewhere near where Peter will be putting the same information for synthesis?

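Ah, I see that the planned taxonomy report (https://github.com/OpenTreeOfLife/germinator/wiki/Overview-of-repository-statistics#fields-for-taxonomy-reports-work-in-progress) will have this information and much more. @pmidford, would it be helpful to have something like the JSON above as a manually-maintained source for these? Or will dates be reckoned in some other way?

For now, I'll add this file to the germinator statistics/ folder (https://github.com/OpenTreeOfLife/germinator/tree/master/statistics). Feel free to clobber this once we have something more complete!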

jimallman commented 9 years ago

> I think we should drop the T20 business.

OK by me. I suppose if we end up with multiple events in the same day, we can figure out a convention to handle it. Maybe something like this, if the stats scripts detect a repeating day?

[
    {"version": "ott2.8", "date": "2014-06-11"},
    {"version": "ott2.7", "date": "2014-05-08"},
  "... same day! let's add a decimal suffix...",
    {"version": "ott2.6", "date": "2014-05-08.2"},
  "... or possibly ascending letters?",
    {"version": "ott2.6", "date": "2014-05-08a"}
]

Whatever we decide, should we apply the change for all statistics files?

pmidford commented 9 years ago

The history data shows a number of cases of multiple commits in a single hour, so the hour resolution doesn't resolve the multiple events problem. I don't have an opinion on conventions for identifying multiple events.

pmidford commented 9 years ago

@jimallman I don't understand how I would use these dates - should I be tagging phylesystem commits immediately preceding synthesis? The current synthesis report isn't using historical data.

kcranston commented 9 years ago

> The history data shows a number of cases of multiple commits in a single hour, so the hour resolution doesn't resolve the multiple events problem. I don't have an opinion on conventions for identifying multiple events.

Let's not overthink the date and time stamps. The point is to show progress over the length of the project (years). If someone wants daily / hourly granularity from phylesystem, they can go to GitHub.

jar398 commented 9 years ago

Agreed. If there is more than one in a day, just throw away all but one of them.
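One way this could look in a stats script is sketched below. It assumes each sample is a dict whose "date" value starts with YYYY-MM-DD, and it arbitrarily keeps the last sample seen for each day; the function name, the "count" field, and the sample values are made up for illustration.

```python
def one_per_day(samples):
    """Keep a single sample per calendar day (the last one seen for that day)."""
    by_day = {}
    for sample in samples:
        day = sample["date"][:10]  # 'YYYY-MM-DD' prefix, ignoring any time suffix
        by_day[day] = sample       # a later sample replaces an earlier one
    # ISO-style day strings sort chronologically as plain text.
    return [by_day[day] for day in sorted(by_day)]


# Made-up sample data, only to exercise the function.
samples = [
    {"date": "2015-02-19T14Z", "count": 1},
    {"date": "2015-02-19T20Z", "count": 2},
    {"date": "2015-02-20T20Z", "count": 3},
]
kept = one_per_day(samples)
```

Keeping the first sample of the day instead would only require skipping days already present in `by_day`.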

jimallman commented 9 years ago

> I don't understand how I would use these dates - should I be tagging phylesystem commits immediately preceding synthesis? The current synthesis report isn't using historical data.

This was intended to support a series of taxonomy version pages in the About pages, similar to the existing synthesis release pages. Otherwise the webapp is blind to the history of taxonomy, esp. those versions that weren't used in synthesis. But maybe we should ignore those?

The description of taxonomy reports in the germinator wiki implies a more complete report with this information and much more, so this tiny file might become obsolete. Or it might be a manually-maintained input to those reports, if there's no way to programmatically determine a date for each taxonomy version.

jimallman commented 9 years ago

> should I be tagging phylesystem commits immediately preceding synthesis?

Not sure what you're asking here, @pmidford. I've generally tried to knit the different stats together (using dates and dependencies) in the progress page, so that the stats themselves can remain simple.
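To illustrate what knitting by date might involve, here is a minimal sketch that finds the most recent taxonomy version built on or before a given day. The function name `latest_version_on` is hypothetical, the version list is the one from the top of this issue with the hour suffix dropped, and the result is only the latest version built by that day, which is not necessarily the one actually used in a synthesis run.

```python
from bisect import bisect_right

# Taxonomy build dates from the list at the top of this issue (T20Z dropped).
ott_versions = [
    {"version": "ott2.8", "date": "2014-06-11"},
    {"version": "ott2.7", "date": "2014-05-08"},
    {"version": "ott2.6", "date": "2014-04-11"},
    {"version": "ott2.5", "date": "2014-03-28"},
    {"version": "ott2.4", "date": "2014-03-17"},
    {"version": "ott2.3", "date": "2013-12-04"},
]


def latest_version_on(day, versions):
    """Return the newest version whose build date is <= day, or None."""
    ordered = sorted(versions, key=lambda v: v["date"])
    dates = [v["date"] for v in ordered]
    # bisect_right counts how many build dates fall on or before `day`.
    index = bisect_right(dates, day)
    return ordered[index - 1]["version"] if index else None


candidate = latest_version_on("2014-05-01", ott_versions)
```

Plain string comparison is enough here because YYYY-MM-DD stamps sort chronologically.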

pmidford commented 9 years ago

No problem, I was just throwing this out trying to understand the previous comment - agree, it's probably better for you to knit this together.

jar398 commented 8 years ago

Forgetting about the germinator repo suggestion, it appears this issue is fully addressed by this file:

webapp/static/statistics/ott.json