draftTreeName should be incremented each time the tree changes

OpenTreeOfLife / treemachine

Source tree graph database

draftTreeName should be incremented each time the tree changes #93

Closed jar398 closed 10 years ago

jar398 commented 10 years ago

Even if there's only one tree in the db, we still need versioning, just to stay sane. At least I think so. And as another issue says there should be an alias that is fixed for all versions but refers to the current version.

curl -X "POST" -d "{}" -H "Content-type: application/json" http://api.opentreeoflife.org/treemachine/ext/GoLS/graphdb/getDraftTreeID

{ "startNodeOTTId" : 93302, "draftTreeName" : "otol.draft.22", "startNodeTaxName" : "cellular organisms", "startNodeID" : 215

chinchliff commented 10 years ago

It seems like there is some complexity here. If we want versioning then either we need auto-incrementing (not straightforward) or forced manual version designation on synthesis (tedious). Is there a better solution I'm not thinking about? I don't have a problem with either one, just considering implications. Also, how do we know which tree to alias? This is something that seems it will also have to be manually set unless we have auto-incrementing set up... What are you thinking of for mechanisms/conventions here?

jar398 commented 10 years ago

I don't understand what's hard about auto-incrementing. The database is updated via a script. You put the version number in a file. The script reads the file, increments the number, and writes the file. Then the version number is passed as a parameter along with all the other synthesis parameters.

Alternatively, we could use timestamps, which also seem easy. Maybe those are not so nice to use in names, but if we are always talking about the latest version in a synthetic tree series (the series name would also be a parameter that a script can communicate to the graphdb) we would never have to pass a timestamp as a parameter in an API call.

The main thing is that someone can record the version when they do some operation, and then if the operation fails when repeated, it will be possible to tell whether the failure can be attributed to a change in version, or has to be attributed to something else. Or, if you are anticipating some wonderful change to the synthetic tree, you can look and tell whether a new synthesis has happened. I've certainly been in the latter situation many times; it's frustrating to have to do experiments or ask Stephen to figure out what version is current. (Ideally I'd like to see some textual release notes explaining what's new in this version. I guess if I had timestamps for old and new versions I could at least look at the gcmdr commit log.)
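The file-based auto-increment described in this comment could look roughly like the sketch below. It is only an illustration: the function names, the `version.txt` style counter file, and the choice to treat a missing file as "version 0" are assumptions, not part of treemachine or its synthesis scripts.

```python
from pathlib import Path


def next_version(version_file: Path) -> int:
    """Read the last version from a file, increment it, and write it back."""
    # Assumption: a missing file means no version has been issued yet.
    last = int(version_file.read_text().strip()) if version_file.exists() else 0
    current = last + 1
    version_file.write_text(f"{current}\n")
    return current


def draft_tree_name(series: str, version: int) -> str:
    """Build a name in the style of 'otol.draft.22' seen in the API response."""
    return f"{series}.{version}"
```

A synthesis script would call `next_version` once per run and pass the resulting name along with its other synthesis parameters.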

blackrim commented 10 years ago

I think I prefer time stamps to the version number. That seems more natural to me. If that isn't ok, I can make it so that it is passed as an argument. Is a time stamp ok?

Stephen

jar398 commented 10 years ago

Fine with me, but only if the generic tree series name is implemented as well (i.e. a tree name 'opentree' or 'draftTree' that refers to different timestamped versions at different times).

If we have a generic name for the tree then we might be able to do away with getDraftTreeID and replace it with getTreeInfo (which would show the timestamp), taking us closer to what I think is your design goal of not having a single distinguished tree in the graphdb, but rather allowing many trees to coexist on equal footing. The front end would only have to know the name 'opentree' and pass it in to each call that needs to specify a tree.

Maybe this design should get a bit of discussion and review through the course of today before any implementation starts, since this is going to be a rather visible change.
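One way to picture the generic series name is as an alias that is re-pointed at each new timestamped tree. The sketch below is hypothetical: `TreeRegistry`, `publish`, the returned field names, and the example tree ids are invented for illustration, and `get_tree_info` only stands in for the proposed getTreeInfo call, whose real shape is not specified in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class TreeRegistry:
    """Maps fixed series names (aliases) to the concrete tree that is current."""
    current: dict = field(default_factory=dict)  # alias -> current tree id
    history: dict = field(default_factory=dict)  # alias -> all tree ids, oldest first

    def publish(self, alias: str, tree_id: str) -> None:
        """Record a new tree and re-point the alias at it."""
        self.history.setdefault(alias, []).append(tree_id)
        self.current[alias] = tree_id

    def get_tree_info(self, alias: str) -> dict:
        """Resolve an alias to the concrete tree it currently refers to."""
        tree_id = self.current[alias]
        return {
            "alias": alias,
            "treeId": tree_id,
            "versionsPublished": len(self.history[alias]),
        }
```

With something like this, a front end would only ever pass `'opentree'`, and a caller who wants to record exactly which tree they used would read the concrete id from the info response.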

blackrim commented 10 years ago

yeah, more on this in another email.

chinchliff commented 10 years ago

One problem to consider is that although the draft trees are essentially a linear series, synthesis trees in general are not. Joseph makes intermediate ones on his machine separately from Stephen, as have I, and so could anyone else. So incrementing the version (or updating the most recent timestamp, or whatever is used to identify the "latest" tree) is something that should probably have to be intentionally triggered. Perhaps obvious and not particularly difficult, but something to keep in mind.

jimallman commented 10 years ago

Agreed that timestamps would work well, if they’re sortable and human-readable.
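As a rough illustration of "sortable and human-readable", a fixed-width UTC timestamp sorts chronologically when compared as plain text. The format below (ISO 8601 basic style) and the function name are assumptions; nobody in this thread settled on a particular format.

```python
from datetime import datetime, timezone


def tree_timestamp(moment: datetime) -> str:
    """Format a timezone-aware moment as a compact UTC stamp such as '20140604T124136Z'."""
    if moment.tzinfo is None:
        # Naive datetimes are ambiguous across machines, so reject them.
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
```

Because every field is zero-padded and ordered from year down to second, sorting the strings also sorts the trees by creation time.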

jimallman commented 10 years ago

Argh, having second thoughts. Version numbers would probably be easier to share than a fine-grained date/time string, both verbally and in emails, configs, etc.

jar398 commented 10 years ago

I prefer version numbers as well. In other projects I've just put them in a file.

Guessing most people have seen this... http://semver.org/

blackrim commented 10 years ago

that is fine. however, i think cody raises a real issue related to how this would work with local making of synth on each machine. when does the version number increment? what about when people make synth on their own? there seems to be only one instance where the version is required and that is the central synth. maybe there is a way you are thinking this would work that i am missing though

jar398 commented 10 years ago

I thought the trees in the universe would be organized into a set of series. Each series would have a file containing a version number. Different people would make versions of different series. Different series would have different names. Series names could be allocated centrally (e.g. by you or me) or using DNS for mutex the way W3C likes you to. Or people could just pick series names and tough luck if there's a collision, which is how software names work (e.g. git, python, etc.).

If you build on different machines, you could either use different series names, or share a name and shuttle the version number file around.

If you just want to do a one-off tree, give it any series name you like (timestamp, uuid, etc.), and an arbitrary version.
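The series idea above might be sketched like this: each series keeps its own counter file, so different people can version different series independently, and a one-off tree simply gets a throwaway series name. The directory layout, the `.version` file suffix, the `oneoff-` prefix, and both function names are assumptions made for the example.

```python
import uuid
from pathlib import Path


def next_in_series(series_dir: Path, series: str) -> str:
    """Increment the counter file for one series and return 'series.N'."""
    counter = series_dir / f"{series}.version"
    # Assumption: a series with no counter file yet starts at version 1.
    last = int(counter.read_text().strip()) if counter.exists() else 0
    counter.write_text(f"{last + 1}\n")
    return f"{series}.{last + 1}"


def one_off_name() -> str:
    """Name a throwaway tree: a random series name plus an arbitrary version."""
    return f"oneoff-{uuid.uuid4().hex}.1"
```

Sharing a series across machines would then mean sharing (or shuttling around) that one counter file, as the comment suggests.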

blackrim commented 10 years ago

ok. i am guessing for most users, there won't be a series or versions but most will be one offs (testing and playing -- including the dozens and dozens of synths that we do to test the files) so i just want to make sure that these are not made more challenging by versions. i am wondering whether there might be a better way. (just thinking if there were global versions for phylogenies produced by other phylogenetics programs -- seems a little odd -- maybe we can just apply an id after the fact to be able to identify the tree when we want to communicate?)

blackrim commented 10 years ago

another thought. i think it might be best for someone (gcmdr?) to house the current version and that when we are ready for a new version, we apply an autoincrement above the last version to a synth tree in the database. this would be more rare for us. of course the synth trees can have any ids and people can do whatever they like for those, but we would have the major trees versioned like this. does this work for people? On Wed, Jun 04, 2014 at 01:19:46PM -0400, Stephen Smith wrote:

ok. i am guessing for most users, there won't be a series of versions but most will be one offs (testing and playing -- including the dozens and dozens of synths that we do to test the files) so i just want to make sure that these are not made more challenging by versions. i am wondering whether there might be a better way. (just thinking if there were global versions for phylogenies produced by other phylogenetics programs -- seems a little odd -- maybe we can just apply an id after the fact to be able to identify the tree when we want to communicate?)

On Wed, Jun 04, 2014 at 10:14:25AM -0700, Jonathan A Rees wrote:

I thought the trees in the universe would be organized into a set of series. Each series would have a file containing a version number. Different people would make versions of different series. Different series would have different names. Series names could be allocated centrally (e.g. by you or me) or using DNS for mutex the way W3C likes you to. Or people could just pick series names and tough luck if there's a collision, which is how software names work (e.g. git, python, etc.).

If you build on different machines, you could either use different series names, or share a name and shuttle the version number file around.

If you just want to do a one-off tree, give it any series name you like (timestamp, uuid, etc.), and an arbitrary version.
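The version-file idea above can be sketched roughly as follows. This is only an illustration, not anything treemachine or gcmdr actually does: the function name `next_tree_name`, the `<series>.version` file layout, and the `series.N` naming (modelled on `otol.draft.22`) are all assumptions.

```python
from pathlib import Path


def next_tree_name(series: str, version_dir: Path) -> str:
    """Bump the counter stored for `series` and return a name like 'otol.draft.23'.

    Hypothetical layout: one plain-text file per series, named
    '<series>.version', holding the last version number that was used.
    """
    version_file = version_dir / f"{series}.version"
    # A series with no file yet starts at 0, so its first tree is version 1.
    current = int(version_file.read_text().strip()) if version_file.exists() else 0
    new_version = current + 1
    # Persist the new number so the next build continues from here.
    version_file.write_text(f"{new_version}\n")
    return f"{series}.{new_version}"
```

Building the same series on a different machine would mean copying the version file over first, as described above; nothing in this sketch guards against two machines bumping the same number at once.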

mtholder commented 10 years ago

At the risk of sounding like a parody of myself (in my "the solution is always git" persona)...

It seems like one would want to:

1. tweak configuration files that tell gcmdr how to build the synthetic tree,
2. build the tree,
3. commit the changes to gcmdr config files,
4. run a script that:
   1. syncs config files with github,
   2. uses a git tag to provide an incremented version number,
   3. pushes the graph db and the version # to the dev servers.

That way:

versions increment nicely (syncing w/ GitHub before tagging prevents version # clashes)
You only create a user-friendly version # when you need one (when you are posting to a shared server).
The provenance for the build is clear because the version numbers are all tags in the repo that did the tree creation.
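A minimal sketch of the tagging part of such a script, assuming tags follow the `otol.draft.N` pattern seen in `draftTreeName` and that the remote is called `origin` (both are assumptions, as are the function names). It only covers syncing and tagging; pushing the graph db to the dev servers is left out.

```python
import re
import subprocess


def next_version_tag(existing_tags, prefix="otol.draft."):
    """Return the tag one above the highest '<prefix><N>' tag in existing_tags."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    # Compare numerically so that .10 ranks above .9; ignore unrelated tags.
    versions = [int(m.group(1)) for t in existing_tags if (m := pattern.match(t))]
    return f"{prefix}{max(versions, default=0) + 1}"


def tag_release(prefix="otol.draft."):
    """Sync tags with the remote, then create and push the next version tag."""
    # Fetch first so a tag pushed from another machine is counted.
    subprocess.run(["git", "fetch", "--tags"], check=True)
    listed = subprocess.run(
        ["git", "tag", "--list", f"{prefix}*"],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    tag = next_version_tag(listed, prefix)
    subprocess.run(["git", "tag", tag], check=True)
    subprocess.run(["git", "push", "origin", tag], check=True)
    return tag
```

Fetching before tagging is what keeps two machines from picking the same number, though two people running the script at the same moment could still collide; in that case the second push of the tag would be rejected and the script would stop.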