Closed chanskw closed 8 years ago
The other problem is that when clients look at IBMStreams Github site, it is hard to see which project has matured enough and is ready for production
I think that's actually a separate problem: the github site is aimed at people developing toolkits, not people who just want to use the toolkits. It's what I was trying to raise in issue #66, but there doesn't seem to be any interest in that.
I think it is a related problem. Github is aimed at people developing the toolkit as well as people using it.
I believe tagging based on project maturity is a first step towards the usage problem. I will think of something for issue #66 and write a proposal there.
Just to be clear, I meant that the current state of the IBMStreams github site is not friendly towards people who just want to use toolkits and don't care about the development of them.
I agree that we should have some indication of a toolkit's production readiness and that classification of incubating vs production would help.
The proposed process for having incubating branches within production toolkits seems unwieldy to me. It would be complex to release multiple separate incubating features and to handle features which do not reach production readiness before the next production release, and it would introduce potential confusion about which toolkit release should be used.
I'd propose the ability to indicate any features in a production toolkit which are experimental/incubating so users know if they are using a feature which is not at production level yet. We could also provide releases which do not include incubating features if necessary.
For the issues Dan is raising, I'd propose an overall io page that catalogs the toolkits and is user friendly for someone who purely wants to consume the toolkits and not develop them. It could include a summary of each toolkit's purpose, release level, production readiness, etc. It could be categorized by type of function or simply by production readiness.
Mike - I like the idea of being able to mark features in the toolkit as experimental. I think this is something we should think about adding to the operator model and to the language.
Aside from this, I still think there needs to be some incubation branch for features. Just to be clear, I do not expect features to be in an incubation branch for a long time. It will be a short period between when the feature is originally developed and when we decide it's ready to be integrated into production. Most of the time, it is ok to deliver major changes to the master branch, except when the toolkit is about to be released.
The other part of the problem is that the toolkits do not have a clear release plan, so it is hard for the community to contribute and to plan which features will be included in a release.
Perhaps we should have something like this:
What do you think about this?
"release plans" seem to be moving away from the agile approach release early, release often.
Seems like how projects manage releases is up to the committers, not any top-down rule.
While I understand the idea behind incubation/production projects, I'm not sure I understand why we can't just use releases as the mechanism, pre-release, or not a pre-release. Most of the description of the production projects just seems like standard software development.
I agree with the issue @ddebrunner raises--some way to point people who only want to use toolkits to an easy-to-download link would be very helpful.
Also, if I were on the management committee, and I were applying the criteria you've listed based on what's in the repositories, I'd vote no on at least three out of four of the proposed production toolkits because there's very little testing in the repositories themselves. streamsx.messaging has only Kafka tests, streamsx.inet has some http tests but no InetSource test, and streamsx.hdfs has no tests for HDFS2FileSource, HDFS2FileSink, or HDFS2DirectoryScan.
Having a release plan does not stop us from being agile, release early and release often.
For example, your release plan could say that you release once every month, with a pre-release at the end of each month and, at a certain point in time, a planned official release.
It is up to the project to figure out when you want to release and how often. The main point is that it needs to be communicated externally so people know what to expect.
Kris, I disagree... all of those toolkits have tests. The tests are not open source, but they exist, and we make a significant effort to fully test all of the operators in those toolkits to make them production ready.
I think the point you are trying to bring up is that we did not open-source the tests. I agree it is an issue, but that's a separate problem to solve.
+1 for the original proposal. There needs to be a level of quality control over the toolkits/repositories that are pulled into the product. Agile development is not an excuse for contributing to the shipped product on an ad-hoc basis. Maintaining quality and being agile are not mutually exclusive concepts and both can be achieved simultaneously.
Having an incubation branch allows for continuous development of new features while providing the toolkit stakeholders an opportunity to comment/review throughout their development. Close collaboration with all stakeholders is key to the agile process.
@cancilla All that is true, but why is there a need for two classes of project? Anyone coming to github can use any toolkit in their production app, it's all released as "You are solely responsible for determining the appropriateness of using or redistributing the Work". IBM may have additional requirements for including a toolkit in the IBM Streams product, but that's not really relevant to the open source github site.
An "incubation" branch just sounds like a feature development branch, standard git style development.
Having a standard for the maturity/production readiness of a toolkit and feature which can be used by a user of all toolkits is something we should specify at an organization level. Having incubating projects and being able to mark features as experimental are widely used concepts and I hope we adopt that practice.
I do not agree with being too detailed or prescriptive on specifying the use of an incubating branch. It may be natural to create branches for features which are planned for future releases, where there could be a number of branches with disjoint features. In other cases all new features may be worked on a shared dev branch or on individual developer branches. We could write up some guidelines or suggestions but I think that the specific branching strategy used is up to the committers on the project. Similarly I think we could encourage publishing a roadmap when we have specific intentions/desires or working on things for an upcoming release but it should be at the discretion of the committers for each project and not a requirement.
@chanskw I agree that the toolkits you proposed have a good set of tests. My wording was vague because I didn't know if you would be okay with us saying we had internal tests.
The process you've described has the management committee doing a public vote on a public proposal with public criteria, but using hidden information (the tests, in this case) to make their decision. That may be an unavoidable lapse in transparency, but let's at least be clear that we're counting internal tests visible only to Streams team members, so people don't think that no tests are needed.
+1 on @mikespicer last comment.
+1 on marking features in the toolkit as experimental. +1 on tagging toolkits as production/incubation, as tagging only. I agree with @ddebrunner that the release/pre-release mechanism should be enough to clarify which release is stable. Open source comes with agility, which is mostly good, but it lacks the stability of standard products. As an example: I updated the MongoDB toolkit to use the latest C++ driver and suddenly discovered that the connection pool API had been dropped. No deprecation, no warning, just like that. And this is one of the popular NoSQL databases!
Thank you for all your feedback.
From what I have gathered, I have enough +1 votes on the following:
1) Project incubation process - The feedback I got from @ddebrunner is that we should not use the name "Production Project" in the process. The incubation process is to make sure that projects are viable, not to ensure the "quality" of the project or that it is "production-ready". Therefore, I will update the project incubation process as follows: we have two types of project, Incubation Project and Project. I will update the criteria to reflect that this process is to ensure that the project is viable.
2) +1 on marking features / operators as experimental
Based on the feedback I got, the incubation branch should not be mandated at the top level. Instead, if a project decides that it needs one to keep the master branch stable, it can be implemented at the project level. I will try this first in the projects where we feel it is necessary, and then write it up as recommendations in case other projects would like to follow.
I also feel that the feature incubation process is feature-dependent. In some cases, it makes sense to develop risky and big feature off an incubation branch. In other cases, it may be sufficient if we have community votes. The key thing I would like to bring out is that I am not trying to create a lot of obstacles to integrating features into the master branch. However, I believe a well-defined, lightweight process will help us deliver high-quality code in the open while we remain agile. I do not think this kind of process is currently in place. But I agree that this can be implemented at the project level, where committers decide what is the best thing to do.
Please let me know if I misunderstood anything.
I have updated the process page. Please vote or let me know if you disagree with anything. https://github.com/IBMStreams/administration/wiki/Draft:-Project-Incubation-Process
Thanks!
I also feel that the feature incubation process is feature-dependent. In some cases, it makes sense to develop risky and big feature off an incubation branch. In other cases, it may be sufficient if we have community votes.
I'm afraid I don't understand what this is trying to say.
How would an "incubation branch" be different from a feature branch? I'm not sure we need to invent new workflow types for git based projects. https://www.atlassian.com/git/tutorials/comparing-workflows/feature-branch-workflow
It's the same thing. It's just the wording. The only idea is that for risky things, sometimes we may want to put them on a separate branch so we have a chance to work on them before integrating.
+1 on updated proposal
Thanks James -
I will keep voting open until Aug 26. Please vote or object by then. Thanks very much.
I agree that features should generally not be developed in the master branch.
But I'm not sure what you're proposing to change with regard to feature or incubation branches. Our current directions do suggest using what's effectively a feature branch development process, just using a fork instead of a branch (one, two).
Is there some reason you want contributors who have push access to use branches in the IBMStreams repository rather than working in their own forks? I find using forks is easier, particularly for new users, and I think it works equally well for evaluation as long as there is only one outstanding feature. Is there something I'm missing?
Hi Kris -
This vote is for the project incubation process only. Branching and feature integration will be discussed at the project level.
Thanks...
Project have at least one official and stable release
Project has been officially tested (functional, stress, performance tests, etc.)
Project has an official website that contains information and documentation of the project (e.g. github.io pages)
I'm not in favor of "official" in this context, as I have no idea what it means (who makes anything "official"?), suggest:
Project has at least one release (not tagged as pre-release) and follows semantic versioning (http://semver.org/)
Project has sufficient tests (functional, stress, performance tests, etc.)
Project has a website that contains information and documentation of the project (e.g. github.io pages)
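To make the first criterion a bit more concrete, here is a rough sketch in Python of how one might check whether a list of release tags contains at least one non-pre-release semantic version. The function names and the example tags are made up for illustration, and the pattern is a simplified approximation of the grammar on semver.org. It only looks at the version string; GitHub's own "pre-release" checkbox on a release is a separate flag that this does not read.

```python
import re

# Simplified MAJOR.MINOR.PATCH pattern with optional pre-release and build
# metadata. This is an approximation of the semver.org grammar, not the full one.
SEMVER = re.compile(
    r"^v?(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)"
    r"(?:-(?P<pre>[0-9A-Za-z.-]+))?"
    r"(?:\+(?P<build>[0-9A-Za-z.-]+))?$"
)


def is_stable_release(tag: str) -> bool:
    """True if the tag parses as a semantic version with no pre-release part."""
    match = SEMVER.match(tag)
    if match is None:
        return False  # not a semantic version at all
    return match.group("pre") is None


def has_stable_release(tags) -> bool:
    """True if at least one tag in the iterable is a stable semantic version."""
    return any(is_stable_release(tag) for tag in tags)


if __name__ == "__main__":
    # Hypothetical tag list, not taken from any real repository.
    example_tags = ["0.9.0-beta", "v1.0.0-rc.1", "v1.0.0"]
    print(has_stable_release(example_tags))
```

Whether a 0.y.z version should count as a release for graduation purposes is a policy question this sketch deliberately leaves open.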
@ddebrunner Sure.. we can change that...
Draft updated with @ddebrunner's comments.
Looks good except that the term "production" project is still used 3 times in the description. +1 on the basis that production will be removed/replaced. I hope that we can address a mechanism for marking features as experimental in a separate issue.
arrrggh! I thought I cleared all of it. I went through it again, and the word "production" should no longer appear in the process. Thanks!
+1
With the wording changes, I'm no longer worried about the testing transparency issue I raised earlier.
But I don't know what you mean by "add a tag." I looked for a github mechanism to tag a project and didn't see one. Did I miss the mechanism, or does "tagging" mean mentioning it is in incubation in the description or readme?
The following projects are the initial set of IBMStreams projects: streamsx.hdfs, streamsx.messaging, streamsx.hbase, streamsx.inet, samples, IBMStreams.github.io, benchmar, tutorials, administration
I have only added an (Incubation) tag in the project description for now. Will think of a better way to identify incubation projects.
Please follow the process as documented on the wiki to graduate from incubation. I suggest the following projects go through this graduation process in the near future:
streamsx.topology, streamsx.network, streamsx.dp, streamsx.sparkmllib, resourceManager
Closing...
I would like to introduce a new process about project and feature incubation.
IBMStreams Github projects provide a platform for us to develop and prototype new ideas quickly. It allows us to be agile in delivering new capabilities to our customers.
At the same time, some of these projects are used by clients in production environments. In those cases, we need to be more careful about how the project evolves (e.g. maintaining API compatibility, making sure that new features are stable when they are integrated into the main branch).
The problem I am trying to solve is - how do we maintain project stability, while we continue to be agile?
The other problem is that when clients look at IBMStreams Github site, it is hard to see which project has matured enough and is ready for production... and which is really in the early development stage.
Please see this process proposal: https://github.com/IBMStreams/administration/wiki/Draft:-Project-and-Feature-Incubation-Process
Please provide feedback for the proposal.
If this process is accepted, I would also like to propose the following projects to be tagged as production projects:
Thanks...