w3c / modern-tooling

Work of the modern tooling task force
http://w3c.github.io/modern-tooling/
MIT License

Github: No folders for repos #32

Open timbl opened 9 years ago

timbl commented 9 years ago

The flat space of repositories for each organization does not allow one to put groups of similar or related repos together. This makes navigation and use of github in general a pain once you have a lot of repos. We could build a hierarchical or other structure on top of github which adds this functionality.

darobin commented 9 years ago

I agree we need to capture structure and that flatness is difficult; but I'm a lot less sure about hierarchical structure. Hierarchies aren't very good at modelling reality, and even worse at mapping to how humans find information.

As I said in #16 I think we should start flat and enable (non-hierarchical) grouping. I think that the grouping should be enough to make the system usable. This is sort of a classic storage vs interface issue. We need the one-repo-per-spec system because git branches work across a full repo and things get weird when you have branches for one product applying to another. This does not however prevent us from making the content sortable into higher-level structures; including possibly the option of providing navigation of our repositories using the existing filters.
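To make the storage vs interface point concrete, here is a minimal sketch, assuming each repo in the flat org simply carries a set of free-form labels. The label names and the `group_by_label` helper are made up for illustration; nothing here is an existing GitHub feature or API.

```python
from collections import defaultdict

# Hypothetical flat listing: every repo sits at the top level of one org
# (storage), and carries any number of illustrative labels (interface).
REPOS = {
    "webappsec": {"security"},
    "csvw": {"data"},
    "web-platform-tests": {"testing", "security"},
    "modern-tooling": {"tooling"},
}


def group_by_label(repos):
    """Return label -> sorted repo names; a repo may sit under several labels."""
    groups = defaultdict(list)
    for name, labels in repos.items():
        for label in labels:
            groups[label].append(name)
    return {label: sorted(names) for label, names in groups.items()}


if __name__ == "__main__":
    for label, names in sorted(group_by_label(REPOS).items()):
        print(f"{label}: {', '.join(names)}")
```

The repos themselves never move: a repo can show up in several groups at once, which a folder hierarchy cannot express without duplication.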

plehegar commented 9 years ago

Note that IETF is using a different approach: they have one org per group, such as https://github.com/httpwg. It also allows them to resolve the problem of admins. cc @mnot

darobin commented 9 years ago

That's an interesting idea. We'd need to set up W3C to have access but that's not hard. @mnot: do you have feedback on how it works in practice?

mnot commented 9 years ago

We're just starting; httpbis and tlswg have used github for a while, and the IAOC is starting to talk about supporting/using github officially (we just met with them about it a few weeks ago in Dallas).

We were debating the merits of an "IETF" org vs. one-per-WG, and the factors that convinced us the latter was the right choice were:

So far it works really well for HTTP and TLS; the feedback is absurdly positive. Not sure how it works on an org-wide basis yet, of course.

tobie commented 9 years ago

Remember that WGs are organized around IP issues more than topics and that as such they make little sense for a layperson.

mikewest commented 9 years ago

WebAppSec is fairly successfully using a single repository for the majority of its specs (https://github.com/w3c/webappsec). This is particularly valuable for specifications that have cross-dependencies, as all the relevant bits can be updated in a single commit.

I suspect that we're an anomaly, however. This model clearly wouldn't work for a group like WebApps.

anselmh commented 9 years ago

I do really like having all w3c specs in one place. You can organize a github organization by teams, which I’d say would be appropriate, but I wouldn’t like seeing different accounts for the specs. It would be hard to find out which accounts are official and which are not.

yoavweiss commented 9 years ago

I think it'd make it significantly easier to find specific specs if each WG had its own org, or alternatively (like WebAppSec as @mikewest mentioned) all the specs under a single repo. The current situation where there's no such convention makes it so that the w3c org has ~200 repos under it, which is a mess.

tobie commented 9 years ago

Priority of constituencies implies we should cater for devs' needs before catering for those of implementors or spec editors. Clearly, the way WGs are organized doesn't reflect topics (as mentioned before, it has more to do with IP concerns). Furthermore, some WGs are huge, while others are very small. It seems that the WGs are a W3C implementation detail that should be hidden from devs as much as possible in order to offer them a more coherent and cohesive vision of the Open Web platform.

Ideally, this would be done through GitHub itself. For example, I'd imagine we could have a single W3C specs org, with repos named by their shortnames. This org would be distinct from other W3C related projects that might exist (tests, OS code, etc.). A single repo for all specs (as proposed by @mikewest above) is another interesting solution.

xquery commented 9 years ago

to add a datapoint, ever since we (XML Proc WG) migrated to github we've had a nice bump in productivity and found we could easily use a single repo to generate different interrelated specs

spec repo: https://github.com/xproc/specification
output: https://xproc.github.io/specification/

We did debate single vs multiple at some length, with the current single repo winning out.

The github working approach works well for our purposes (and those of the community of the willing to contribute). However, I think overloading github could reduce flexibility (both ways); completely agree with the proposal of layering hierarchies that cater to other points of view.

iherman commented 9 years ago

Just like @mikewest the CSVW Working Group has also chosen the single repo model for several (five) documents (https://github.com/w3c/csvw). The reason why it worked well, I believe, is because the documents to be developed are very strongly related, they share a bunch of data (using respec's inclusion facilities), etc. We did not really experience problems with PR-s and branches overlapping various documents.

This became particularly helpful when handling issues. Many issues had overlaps on several documents. Separate repos, ie, separate issue lists for the different documents, would have become a real hurdle I believe.

I guess there is no one approach that fits them all. Some working groups are really supergroups, in which case separate repos (maybe under one organization) make a lot of sense. But for other groups a single repo is ideal.

tobie commented 9 years ago

@iherman:

We did not really experience problems with PR-s and branches overlapping various documents.

There is absolutely no reason that this should be an issue in a repo containing more specs. This isn't an issue in w3c/web-platform-tests.

tobie commented 9 years ago

The more I think about this, the more I like @mikewest's single repo for all specs approach.

mnot commented 9 years ago

Single repo for all specs is what httpbis and tlswg do (although some people still like repo-per-spec, we beat them down).

One thing we've found useful in doing that is "git mv {spec} archive/" once it isn't being actively worked upon; that saves history and reduces clutter.
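A minimal sketch of that archiving step, driving git from Python. The helper names (`git`, `archive_spec`, `history`) are made up for illustration, and it assumes git is on the PATH and a commit identity is already configured for the repo. `git log --follow` is used here to check that the history stays reachable after the move.

```python
import subprocess
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def archive_spec(repo: Path, spec: str) -> None:
    """Move a spec directory that is no longer active under archive/ and commit."""
    (repo / "archive").mkdir(exist_ok=True)
    git(repo, "mv", spec, "archive/")
    git(repo, "commit", "-m", f"Archive {spec}")


def history(repo: Path, path: str) -> list[str]:
    """Commit subjects touching `path`, newest first, following renames."""
    return git(repo, "log", "--follow", "--format=%s", "--", path).splitlines()
```

Calling `archive_spec(repo, "some-spec")` and then `history(repo, "archive/some-spec/index.html")` should list the commits made before the move as well as the move itself, as long as git can detect the rename.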

That said, I do question whether putting all w3c specs in a single repo is really serving the needs of devs; if they want to participate lightly (e.g., make pull requests), working with a huge repo is yet another barrier to entry.

It also seems like it forces WGs to conform to one set of practices. While that can be seen as serving the needs of devs (through consistency), it also empowers those who set those practices, which is yet another constituency (that isn't listed in the priorities :).

Anyway, I think the question here is slightly different -- it's whether you do orgs on WG or whole-of-w3c lines.

frivoal commented 9 years ago

No strong opinion on one GitHub org per group vs for the whole w3c.

On the other hand, I think it is very desirable to have one repo for all specs of a WG, not one per spec. I am sure some WGs differ, but taking the CSSWG as a reference, it would be troublesome to have one repo per spec.

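Here are some things we do occasionally:

• Take part of a spec, and move it to the next level of the same spec
• Take part of a spec, and move it to another spec
• Take one spec, and split it in several smaller modules
• Take several smaller modules, and merge them into one spec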

When all specs of a WG are in the same repo, the history of the various pieces of text can be traced across spec migrations, and that's desirable.

We also occasionally make a single commit that atomically affects several tightly coupled specs.

The repo also has a few files that are reused across specs. These change infrequently and could be duplicated, or we could use sub-modules for that, but this would introduce complications for no clear benefit.

Also, for a group with this many specs, starting a new spec is not that rare, and creating a new directory in an existing repo is much lower overhead than creating a repo. It is less work, there's no room for mistake in terms of setting up commit hooks or the web server or the twitter integration properly, any editor with commit access to the repo can do it, without having to depend on people with special admin rights...

Branches including specs that they don't affect isn't a problem any more than branching a single spec to modify one paragraph will also include the paragraphs you didn't change. So what? This isn't a source of merge conflicts.

therealglazou commented 9 years ago

Agreeing entirely with what @frivoal said just above.

iherman commented 9 years ago

On 10 Apr 2015, at 09:08 , Florian Rivoal notifications@github.com wrote:

No strong opinion on one GitHub org per group vs for the whole w3c.

I am afraid this is not only a technical issue but a branding one, too. Giving a visible 'face' to W3C on github has its value, at least for me. (I know that other organizations, like IDPF, or other major projects, like RDFLib, do the same, and it works well)

There is also a practical advantage of the current system that I appreciate. Because I am on the 'W3C' organizational level, I get automatically subscribed on all repos that are created in the W3C organization. It may look like a nuisance, because, frankly, most of the new repos are not of interest for me and, therefore, I have to 'unwatch' them to avoid extra emails. But that is a small price to pay to be notified when a new really interesting repo is created and I can then follow the discussion. This is how I got to the one we are just discussing, b.t.w.:-)

My feeling is: "ain't broken..."

I fully agree with everything below. I am in two different groups. In one group I was a complete github newbie and we set up one repo per document; for another one we set up a repo for the group. I fully appreciate the advantages of the latter over the former (luckily, the group with several repos is not very active on github, so I do not pay a heavy price...)

Ivan

On the other hand, I think it is very desirable to have one repo for all specs of a WG, not one per spec. I am sure some WGs differ, but taking the CSSWG as a reference, it would be troublesome to have one repo per spec.

Here are some things we do occasionally:

• Take part of a spec, and move it to the next level of the same spec
• Take part of a spec, and move it to another spec
• Take one spec, and split it in several smaller modules
• Take several smaller modules, and merge them into one spec

When all specs of a WG are in the same repo, the history of the various pieces of text can be traced across spec migrations, and that's desirable.

We also occasionally make a single commit that atomically affects several tightly coupled specs.

The repo also has a few files that are reused across specs. These change infrequently and could be duplicated, or we could use sub-modules for that, but this would introduce complications for no clear benefit.

Also, for a group with this many specs, starting a new spec is not that rare, and creating a new directory in an existing repo is much lower overhead than creating a repo. It is less work, there's no room for mistake in terms of setting up commit hooks or the web server or the twitter integration properly, any editor with commit access to the repo can do it, without having to depend on people with special admin rights...

Branches including specs that they don't affect isn't a problem any more than branching a single spec to modify one paragraph will also include the paragraphs you didn't change. So what? This isn't a source of merge conflicts.




tobie commented 9 years ago

That said, I do question whether putting all w3c specs in a single repo is really serving the needs of devs; if they want to participate lightly (e.g., make pull requests), working with a huge repo is yet another barrier to entry.

If specs are organized by shortname (like in web-platform-tests), I don't see how that's a possible barrier to entry. Furthermore, it should make it easier for devs to contribute to other specs after an initial contribution.

While that can be seen as serving the needs of devs (through consistency), it also empowers those who set those practices, which is yet another constituency (that isn't listed in the priorities :).

I see what you mean, here, but ultimately, either that decision is (vaguely) consensual or it doesn't happen (and W3C's trademark inconsistency just spills over into GitHub).

frivoal commented 9 years ago

@iherman Agreed. I can see benefits to both approaches when it comes to GitHub org, which is why I marked myself as not strongly opinionated on this question, but the branding/discovery argument you and @anselmh put forward is convincing to me.

So my position is:

therealglazou commented 9 years ago

Yes. The CSS WG repo itself is already 174megs with 90 directories. I'm afraid a W3C repo would be enormous and complex.

frivoal commented 9 years ago

Right. I don't think the arguments for putting all the specs of a WG in the same repo apply to putting all the W3C specs in the same repo: 1) coupling is much looser across groups, and cross-spec operations very rarely cross WG boundaries, so the cross-spec history tracking benefits don't really apply; 2) different WGs have different processes and tooling, and while some convergence based on recognized best practices is desirable, putting everything in the same repo would force (through commit hooks and other integrations) more convergence than is generally agreeable.

Side note to @therealglazou : run "git gc --aggressive", you'll go down to 130megs or so.

tobie commented 9 years ago

Again, the problem with one repo per WG is that the WG boundaries are IP-based. Why is geolocation in one WG and permissioning in another? Because of IP concerns. Should devs know or care? Certainly not.

WGs are artificial constraints imposed by lawyers. They add overhead, hinder a holistic approach to API design, security and accessibility, and make W3C (and the Web platform) less approachable.

The thriving community around the web-platform-tests repository shows a flatter approach works.

frivoal commented 9 years ago

While it is true that IP rules influence the existence and boundaries of Working Groups, I disagree that they are, in the general case, the primary base of these boundaries. Many WG would still make complete sense even in the absence of these IP constraints: CSS-WG, SVG-WG, CSVW-WG...

For these, which I believe are the majority case, it makes perfect sense to have one spec repo per WG, as the specs form a tightly coupled bundle, and are developed using a common set of tools and practices.

However, if specs that are developed in separate WGs (for IP reasons or otherwise) do belong together from a technical standpoint and are developed using a common set of tools and practices, then presumably the WGs can be (either officially or in practice) considered as a task force, and we could have a task force specific repo rather than a WG repo for these. This is arguably what's happening with the houdini or fx task forces.

Conversely, if one Working Group is working on a set of specs that don't really belong together, then it may be reasonable to have multiple repos, one for each cohesive subset of the specs they are working on.

tobie commented 9 years ago

So does CSSOM belong in WebApps with DOM or in CSSWG?

frivoal commented 9 years ago

For better or worse, it's maintained by the csswg, and follows our sets of practices and uses our tools, so csswg. Sure, not everything is clear cut, and I am not saying that it would make no sense to have everything in a single repo. It would have some upsides. But I think they are less pronounced compared to the upsides of a repo per WG, and the downsides are more pronounced.

tobie commented 9 years ago

The point I'm trying to make is that from a dev's perspective, which WG maintains it should be irrelevant.

therealglazou commented 9 years ago

I remind you all that all W3C specs have a short name that in most cases (all cases?) lets you identify the originating WG...

wseltzer commented 9 years ago

@tobie would that IP were irrelevant, but until it is, chairs also need to keep WG boundaries in mind in managing contributions -- watching that only WG participants are making contributions, or getting patent commitments from WG non-participants who want to contribute.

tobie commented 9 years ago

@wseltzer seems the system we built for web-platform-tests should mostly cover that.

…and to clarify, I'm not suggesting IP is irrelevant, just that it's irrelevant to Web developers.

jgraham commented 9 years ago

FWIW I think the strong technical reasons to keep all the tests in a single repository (consumers of the testsuite should be running all tests, want to build a single manifest of all tests, tooling is shared between tests, etc.) largely don't apply to individual specs, so I wouldn't use that as a model.

OTOH the tests experience does show the dangers of Galápagos Syndrome when individual groups develop unique tooling, processes, and procedures in isolation, resulting in multiple incompatible systems that have to be supported going forward.

marcoscaceres commented 9 years ago

The single-repo-for-multiple-specs model is a terrible idea. It makes it impossible to follow single specs without jumping through hoops. It also defeats the purpose of having a repo in the first place (because it makes all specs into a single project - which is the point of an organization and not a repo) and generally seems to make a mess of things (e.g., people can't file bugs without prefixing their bugs with the spec name - which makes collaboration/coordination hard and annoying).
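For readers who have not worked in a shared repo, here is a rough sketch of what that prefixing convention implies for anyone sorting issues by spec. The `[shortname] summary` title format and both helper names are assumptions for illustration, not a convention that any group in this thread has documented.

```python
import re
from collections import defaultdict

# Assumed title convention: "[shortname] summary". The bracket syntax is
# illustrative only.
PREFIX = re.compile(r"^\[(?P<spec>[^\]]+)\]\s*(?P<summary>.+)$")


def split_issue_title(title):
    """Return (spec, summary); spec is None when the title has no prefix."""
    cleaned = title.strip()
    match = PREFIX.match(cleaned)
    if match is None:
        return None, cleaned
    return match.group("spec").strip().lower(), match.group("summary").strip()


def issues_by_spec(titles):
    """Bucket issue summaries by spec prefix; unprefixed titles land under None."""
    buckets = defaultdict(list)
    for title in titles:
        spec, summary = split_issue_title(title)
        buckets[spec].append(summary)
    return dict(buckets)
```

Any title filed without the prefix ends up in the `None` bucket and has to be triaged by hand, which is the coordination cost described above.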