Support node ids as strings in treemachine (pre-tm-lite) to avoid flag day

jar398 commented 8 years ago

Having a flag day where the synthetic tree API and its clients must change at the same time would be a disaster. We can help clients prepare for the switch from integer to string by supporting string node ids in treemachine prior to advancing to tm-lite.

This is not the same as adopting mrca-based node ids; I'm just saying find a way to use "123" or "node123" instead of 123.

Where the node id is a parameter to a method, this should be easy - clients can accept 123 and "123" and treat them the same. Where it's a result, there has to be something in the request that triggers the generation of strings rather than integers. This could either be the URL (v3/ vs. v2/), the method name, or the POST payload (like "use_string_node_ids": true); I don't think it matters very much which solution we pick.

mtholder commented 8 years ago

Whenever we deploy tm-lite rather than treemachine, we are going to lose the whole graph part of the API. So it sort of seems unavoidable that we won't be backward compatible.

jar398 commented 8 years ago

I don't think you understood what I was proposing; I didn't say anything about backward compatibility. Technically we're talking about forward compatibility, and this idea matters for development in the period leading up to the switch to tm-lite, not after. A programmer can write code and deploy at their leisure in the period leading up to the loss of integer node id support and if they do there will be no interruption of service. There is obviously no way to do that for methods that are disappearing altogether. So the graph methods are irrelevant here.

mtholder commented 8 years ago

but we aren't going to be able to support "use_string_node_ids" = false when we deploy tm-lite, though. So as soon as we deploy that, it will not be backward compatible with any one who uses that option.

I just don't think many clients will have a hard time writing a version of client code that makes it easy to switch to strings as node designators.

Adding more complexity to the current implementation and API so that clients can use either ints or strings for a week or 2 before we switch to using tm-lite (and strings become the only game in town) does not seem like a good use of our time.

jar398 commented 8 years ago

Para 1 is irrelevant, because I am not talking about backward compatibility and never suggested false would be valid there. Para 2 I assess differently: the tree browser is an important client and I think debugging the new system without this hack will be awful, if only because of scheduling and coordination - there are many plates in the air right now and this group is used to operating without much planning or coordination. And para 3 is about allocating my time (since I'm the one who would be doing the work).

The next step would be scoping out just how much work this involved. That's something I don't know. If it's hard I probably wouldn't do it.

mtholder commented 8 years ago

The proposed hack makes it no easier to update the tree browser. For that step to be completed, someone (probably @jimallman ) will have to move the tree browser over to using strings as node designators. Having a special one-off version of tm that makes some of the old int-based calls work only hides the client calls that will be bugs in a few weeks.

I see no reason why changing tm is easier than

deploying tmlite to devapi,
announce that clients need to work with that API.
fixing our core services.

When the core services (e.g. the browser) work on that version of the API, we are ready to make it live.

If this was 6 months ago, and we were giving external developers plenty of warning with a message of "for a while either int or string will work, but in a few months only strings will work" then I'd agree that this might be worth the effort. It would give people a chance to migrate in a piecemeal fashion. But at this stage, I can't imagine how it could be a good use of programmer-time, because we are going to need a complete switch over of clients to the new system.

jar398 commented 8 years ago

It's about testing. If you deploy A and B in lockstep ('flag day'), and there is an error in their interaction, it is harder to localize the source of the error than if they were deployed and tested in sequence. Debugging may require coordinating two programmers, and coordination is expensive and risky.

In the sequence you propose the system is unavailable between the end of step 1 and the end of step 3. Maybe that is OK for the dev system, if downtime is only a couple of days, but it is risky and uncomfortable.

It is clearly a tradeoff. If the hack is easy to do there is no reason not to do it. If it is hard then that is a reason not to do it.

mtholder commented 8 years ago

"Maybe that is OK for the dev system, if downtime is only a couple of days, but it is risky and uncomfortable."

My suggestion was to deploy to dev, and No - It is not risky to have tree browsing not working on dev. That is the point of having a dev system - to test things before they are deployed to production.

josephwb commented 8 years ago

What is the consensus on this @kcranston @jar398? @mtholder seems against it, and I agree; seems like annoying backwards step for what will amount to a brief time window. I'll do whatever, of course.

jar398 commented 8 years ago

@josephwb If this were done I would not ask you to work on it, I would probably do it myself.

The time for doing this the right way has passed for a number of reasons.

OpenTreeOfLife / treemachine

Support node ids as strings in treemachine (pre-tm-lite) to avoid flag day #200