NixOS / nixpkgs

Nix Packages collection & NixOS

Move from GitHub long-term #41448

Closed: lukateras closed this issue 5 years ago

lukateras commented 6 years ago

See:

Maybe it's worth looking into hosting a GitLab instance, as the GNOME and FreeDesktop projects do.

edolstra commented 6 years ago

I don't see a reason to switch at this moment. If/when Microsoft starts screwing up GitHub, we can always switch then.

Anton-Latukha commented 6 years ago

Microsoft bought a closed-source product along with a closed-source company.

Microsoft is an open-source company now.

What are the facts to the contrary?

I am not a Microsoft fan; I once left a job because its IT stack was mostly Microsoft.

But at this point, if they can really be a good buddy, why not?

bbigras commented 6 years ago

Maybe we should never have used GitHub for our projects, since it's closed source, and now is just the right time to make things right. One of GitLab's downsides is that most people don't have an account and won't even bother to create one for a PR. Maybe that could change now if enough projects move to it.

Anton-Latukha commented 6 years ago

People, let's wait.

The beautiful thing is, we can afford to wait.

And watch IT people flip out because they don't understand that Microsoft would only refresh GitHub AND GitLab development at this point.

All the IT guys flip out, move to GitLab, GitLab development goes through the roof, and it becomes superb over time. I can see people right now starting to write code for migrating from GitHub to GitLab. 8)

We can sit and watch GitLab grow (https://www.openhub.net/p/gitlab), receive all kinds of features, and become ready for our migration to it.

bbigras commented 6 years ago

I can see people right now starting to write code for migrating from GitHub to GitLab

You can already import a repo in 1 click (and it must have been that way for a while).

grahamc commented 6 years ago

Yes, I think the prudent move right now is to wait and see. Moving an entire community is quite costly in terms of time and effort. We should not rush into this decision, and should take great care in evaluating our options and the future.

People who would like to not use GitHub now due to the ownership are able to send patches over email, or maybe even a link to a patch over IRC.

I'm going to close this issue because I don't think we are likely to take any immediate action. I think mirrors and backups are good and smart -- thank you @volth for setting that up.

lukateras commented 6 years ago

@grahamc I agree with the overall sentiment, but I would like a bit more community feedback/discussion before closing this issue, if possible. Judging by the likes on the OP, I am not the only one concerned by this, and closing the issue would significantly decrease its visibility.

Anton-Latukha commented 6 years ago

OK, now that I look closely at the interface, it seems GitLab has most of the features I thought were missing. I personally use GitLab as a private stash, and I have used GNOME's instance quite a lot in recent days. GNOME's hosting seems slow.

And if the discussion is not about Microsoft acquiring GitHub, but about migrating to GitLab because it is better, then I am interested.

I am for migrating once it can be a bulletproof migration, with issues and everything. GitHub in fact lacks some features, ahem... (inline highlighting) ahem... (SVGs).

@yegortimoshenko Can you link pros/cons articles (Slant, say) in the top message, and maybe a small list of the features useful to us? Then if this issue is reopened, everyone referred to it would know the points.

lukateras commented 6 years ago

The initial title did not communicate the intent clearly: I'm concerned about moving away from GitHub rather than about specifically going with GitLab.

Regarding that tangent, GitLab does have all the features, and its Nix highlighting is better in that it properly highlights key-value pairs outside of attrsets. Say, GitHub doesn't highlight this properly:

imports = [ <nixpkgs/pkgs/top-level/all-packages.nix> ];

It is unlikely that there would be issues regarding the feature set, but the sunk cost is obviously huge: there are over 40,000 issues and pull requests. GitLab has an auto-import that retains interlinking, wikis, etc., which should help, but even then this would be hard to pull off.

The main thing behind this is that Linux distributions have traditionally preferred libre infrastructure, and Nix/NixOS/Nixpkgs is a major exception. This is a chance to make things right in that regard, and it might help bring over people who are reluctant to move to NixOS because of our less principled stance on centralization and software freedom.

vcunat commented 6 years ago

No panic yet! Microsoft loves open-source, reportedly, so perhaps they will open-source GitHub code :rainbow:

EDIT: BTW I really like GitLab and we use it at my work all the time (full-time open source). Back when Nix* went to GitHub, there was no other option that came close in usability (for free). Now it's mainly about the cost of making such a big change (issues, PRs, Borg, etc.), and I hope Microsoft won't make it worth paying that price...

Anton-Latukha commented 6 years ago

GitLab announced that they are making GitLab:

Free "for open source and educational projects, this means unlimited access to current and new features, including Epics, Roadmap, Static Application Security Testing, Container Scanning, and so much more!"

"To apply, send a merge request to add your project to a list of open source projects using GitLab Ultimate and Gold."

Official source

lukateras commented 6 years ago

I've reframed the issue title to only be concerned about long-term decision, and de facto discussion still continues, so reopening just to keep this option on the table.

ryantm commented 6 years ago

Additional discussion here: https://discourse.nixos.org/t/github-was-purchased-by-microsoft/313

Anton-Latukha commented 6 years ago

In time there will be articles about big communities moving to GitLab and their experience, and then we will have deeper insight.

Anton-Latukha commented 6 years ago

Updated the top post with a list of the features provided.

nightkr commented 6 years ago

@Anton-Latukha Sadly I don't think NixOS would qualify for that, since there are quite a few paid contributors (@edolstra from LogicBlox IIRC, various consultancies such as @zimbatm's, etc).

EDIT: Nevermind, with the new language (https://gitlab.com/gitlab-com/gitlab-ultimate-for-open-source/commit/8bfafd8872ac59b53d22863ba5b500f30be2f726) it should be fine, since there is no "NixOS Gold" or similar.

coretemp commented 6 years ago

I am not sure why GitLab is pushed so much, since there is also https://www.atlassian.com/software/views/open-source-license-request.

If NixOS were a company, I would also suggest trying to reduce dependencies on the code review system and the issue system to the point that GitHub would just become dumb storage. But then comes the question: who is willing to do that? If you want to build all the tooling to make everything redundant, multi-cloud, multi-vendor backed, go ahead; I am sure everyone would want to use it if it is at least as good as GitHub is currently.

Let's say Microsoft did the worst thing possible: deleted all the data on purpose. The probability of that happening is smaller than that of an operational error by GitHub staff (GitHub was never designed as a security fortress).

The only effect of that would be that more people (including people they might want to hire, because apparently nobody wants to use Windows in a cloud environment) would like them less, which is bad for their stock price.

lukateras commented 6 years ago

I am not sure why GitLab is pushed so much, since there is also https://www.atlassian.com/software/views/open-source-license-request.

Because GitLab has a libre version that we could self-host, and even the EE version has all of its source code available (with the JS under a free license), while Atlassian is just as bad as GitHub in this regard.

For example, if you look at the software that comes on the NixOS install DVD, the overwhelming majority of its developers use GitLab, cgit, Pagure, gitweb, gitolite, etc. I would argue there is a reason why that is the case, and I think that by using GitHub we are, ironically, somewhat alienated from upstream, the larger Linux community, and some of our users.

This is about setting policy.

There are of course other options: Gitea is fully community-driven, @thoughtpolice makes a case for Phabricator. This issue is about moving from GitHub in general, not to GitLab specifically.

See https://github.com/NixOS/nixpkgs/issues/41448#issuecomment-394401494, https://github.com/NixOS/nixpkgs/issues/41448#issuecomment-394427607.

Anton-Latukha commented 6 years ago

Also, at this point it seems definite that GitLab will be the technically superior product going forward.

Because it is free software that is open to improvement, it could reach the same magnitude and influence as Linux or Emacs. People and companies will work on GitLab and develop it further for themselves. It will become a huge, feature-full tool. https://www.openhub.net/p/gitlab

Also, Ruby is pretty clean, intuitive, and simple enough for most people to understand most of the processes and build at least something in it.

The last Ruby release gained a JIT, and WebAssembly is coming. There is a full stack of tech to make Ruby faster than ever before. Now I always sing praises to Ruby.

Anton-Latukha commented 6 years ago

Linux work happens over plain Git and a mailing list; Emacs over Git and the old Savannah.

Besides Git, they are developed on legacy workflows, which slows development or lessens the volume of contributions and contributors.

And since GitLab's own development process uses GitLab itself, its productivity progression corresponds to a power function: f(x) = cx^n.

coretemp commented 6 years ago

If you say GitLab's import works for every single GitHub feature, prove it (this is more difficult than just pressing the import button, but you seem to think that it isn't). GitLab's marketing people can say anything.

Your other assertion is that the current proprietary tools might stop potential contributors. If you can find 5 people who declare this openly on their website specific to the NixOS project, then this might be an issue, but otherwise it is statistical noise. Certainly not a convincing argument to do a major migration.

Only the owner of this repository (or the future one) can set policy. All we can do is provide convincing arguments, which I'd hope the owner of this repository can also think of.

If the current owner ever does things terribly badly, someone will fork. So if it's about policy, I think the right policy is to do nothing at all, or to wait until someone does all the work to build something so convincingly superior that there is no need to discuss anything.

In general, that's a thing that I see a lot in open-source projects: people want to discuss a lot.

Regarding the PulseAudio policy: what should be the policy for technical decisions? Are we a democracy? Or should we appoint a committee and let it make decisions? I think the arguments against PulseAudio are convincing. I think it's great that a user shared knowledge of how broken PulseAudio is; something I would never have known if I hadn't used NixOS. Clearly, that person has a superior understanding of PulseAudio compared to me, and probably every other user in our community.

Why should we not listen to such a person? Is it just because "all the other distributions are not doing it"?

I wonder how a vote amongst people that actually know something about the subject would turn out. I read libcardiacarrest's source code. I am going to guess that 99.5% of NixOS users didn't.

People use GitHub, because it works and people like to use it. That's why companies pay money for it.

Like I said before, if you want to do a migration, just fork all the infrastructure with whatever group likes your tool of choice and see where people open most of the issues, post most of the patches, etc. If your suggestion is really better, then we should expect more people to use your solution to the point that GitHub becomes a deserted wasteland. At that point one might make it official, but until that day this is little more than an idea.

GitLab will never accept patches in the community edition adding SAML support or other enterprise features like more robustness (i.e. all the things that actually make a product valuable). As such, all you are doing is providing free labor to make their enterprise product worth more by squashing bugs in the core of the product. You also seem to be under the impression that there is not going to be any difference between a hosted version of GitLab enterprise by them and something you could run on your own servers; this is a rather naive thought and nobody from GitLab is going to say that they won't have additional proprietary patches running on their systems.

If someone would fork GitLab implementing the missing enterprise features, GitLab's revenue would go to zero the moment Amazon thinks the quality of this hypothetical GitLab fork would be high enough. GitLab will just have the same problem as GitHub (lack of VC funds) in a few years with the difference that nobody is going to buy them, because there won't be a second GitHub exit.

If you really think GitLab is a superior product, then I'd think you haven't used GitLab for an extended period of time (I have).

This thread reminds me a lot of the mailing list vs Discourse issue.

lukateras commented 6 years ago

If you say GitLab's import works for every single GitHub feature, prove it (this is more difficult than just pressing the import button, but you seem to think that it isn't). GitLab's marketing people can say anything.

I recently did a test import of a medium-sized community project (2K+ issues, 5,300 commits) over to GitLab, and it worked really well. Users, issues, pull requests, interlinking, etc. were all imported properly. That only involved pressing the button.

Your other assertion is that the current proprietary tools might stop potential contributors. If you can find 5 people who declare this openly on their website specific to the NixOS project, then this might be an issue, but otherwise it is statistical noise. Certainly not a convincing argument to do a major migration.

These people are not around because we scared them away. Look at the Debian community and you'll find plenty who are not comfortable with hosting infrastructure on GitHub. Also, as a hypothetical, imagine us using Discord instead of IRC and what the effect of that would be.

I wonder how a vote amongst people that actually know something about the subject would turn out. I read libcardiacarrest's source code. I am going to guess that 99.5% of NixOS users didn't.

It's just an API stub, there's nothing to read. I know for a fact that most people who participated in that thread have done great work in Nixpkgs (including the OP), and I wouldn't assume they don't understand the issue or what libcardiacarrest does.

Like I said before, if you want to do a migration, just fork all the infrastructure with whatever group likes your tool of choice and see where people open most of the issues, post most of the patches, etc.

This is not how collaboration and community projects work.

GitLab will never accept patches in the community edition adding SAML support or other enterprise features like more robustness (i.e. all the things that actually make a product valuable).

Demonstrably false: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests?label_name%5B%5D=Community+Contribution&scope=all&sort=popularity&state=merged


This also got off-topic relatively quickly, with PulseAudio/libcardiacarrest opinions, telling me to fork instead of seeking a community decision here, etc. Let's continue in the Discourse thread instead? :-)

Anton-Latukha commented 6 years ago

@coretemp you are too serious.


Off topic.

ALSA...

What fun a complex ALSA setup is: overcoming ALSA and soundcard bugs, learning the ALSA configuration language and all its quirks, a bazillion special files, abstractions, and identifiers, when all of that somehow still does not let you do what you need: a couple of mixers, a couple of audio I/O devices.

ALSA fully embraces race-condition chaos. In the good old days, did your audio ever suddenly appear on your video card, or on the PA system, and not on your headphones? Did some process's sound stream take up the whole audio device? On every boot your audio hardware is an exercise in combinatorics, and your configuration works differently every boot. Every sound device has 3-5 identifiers, and ALSA does not let you use those identifiers in the config. In ALSA it is not possible, by design, to tie a config to a particular sound card. You need to adjust device-type priorities to somehow align the initialization of the cards at boot (and they have a random boot timeout), so that maybe they receive the needed IDs consecutively. Maybe blacklist some modules and load them after startup. And ALSA sometimes does not respect that prioritization, due to various factors. Now try to work on a setup with two identical sound cards...

For professional work there is JACK. Since it works on top of ALSA, JACK also inherits some of the ALSA fun.

PulseAudio is a great simple abstraction, mixer, and networking layer (and somehow no one ever leverages the last one), and it makes life simple for you. It works in standard cases. It tries to overcome the ALSA entropy.


And so we talk here, and wait for articles and the experience of other communities.

Dolstra already said what we do currently.

Yes, Discourse.

puffnfresh commented 6 years ago

As an employee of Atlassian and part of a team which heavily uses Nix and nixpkgs, I'd be very happy to facilitate interaction with the Bitbucket team. If there's anything I can help with, please just let me know.

oxij commented 6 years ago

On the topic, I'd vote for a plain simple Mailman mailing list (a-la Linux) with bots that do patch tracking (because Mailman mailing lists are trivial to dump; you can just download archives in a somewhat censored mbox format), but that seems to be out of fashion now. Phabricator, discussed on Discourse, looks interesting, especially since it seems to integrate well with MLs, but I have not tried it, and it's written in PHP, which is ewww~. To be honest, anything that can be dumped to RFC822 and reliably interacted with via email (GitHub is meh at this, see below) works for me. An ML with some fashionable HTTP interface on top a-la Mailman v3, with ML threads categorized into "issues", "bug reports", and "PRs", seems like an ideal thing from a KISS technical standpoint to me.

But, of course, we can't use SMTP and plain-text files with checklists in the repo itself for something as crucial as project management today; today we have to do everything over HTTP with Responsive CSS and Reactive JavaScript UI running over RESTful JSON RPC built on top of a Sometimes Eventually Consistent no-SQL Database running on somebody else's Cloud SaaS Infrastructure written in four different programming languages, and Ruby, and Node.js, or else nobody would take us seriously. Also, as a serious software project we have to have the same discussions in four separate places on the net (for Participation!) and also have a separate RFC repo where we collaboratively write documents that meta-discuss some very interesting discussions we can't have anywhere else; we are a mature project, which means we are made not just of software, but also of Protocols and Conventions, which is why we need an RFC process like all serious protocol designers have; doing it like the old farts with a bunch of per-subsystem MLs (which don't drown in unrelated discussions) is completely unacceptable; we have to lead the way in solving the old problems with the new tools; we can't just expect our developers to have a working MUA.

Btw, is there a tool with which I could dump GitHub issues to RFC822? I'd like to have an index of everything; I'm missing a couple thousand issues because I subscribed to the whole repository a bit late, and some unknown number of messages from other issues because GitHub regularly loses notifications (which I noticed only recently by comparing the HTTP UI with what my mailserver gets) (surely a consequence of the CAP theorem).
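
Such a tool could look roughly like the following sketch: hypothetical, plain-stdlib Python against the documented /repos/{owner}/{repo}/issues endpoints, with authentication, pagination, and rate limiting all left out, and made-up Message-ID domains:

import json
import mailbox
import urllib.request
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime

REPO = "NixOS/nixpkgs"

def fetch(url):
    # unauthenticated calls hit the rate limit fast; a real tool would paginate and back off
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def rfc_date(iso):
    # GitHub hands out ISO 8601 timestamps; RFC822 wants its own date format
    dt = datetime.strptime(iso, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return format_datetime(dt)

def as_mail(author, subject, date, mid, body, parent=None):
    msg = EmailMessage()
    msg["From"] = "%s <%s@users.noreply.github.com>" % (author, author)
    msg["Subject"] = subject
    msg["Date"] = rfc_date(date)
    msg["Message-ID"] = mid
    if parent:
        msg["In-Reply-To"] = parent  # thread comments under their issue
    msg.set_content(body or "")
    return msg

box = mailbox.mbox("nixpkgs.mbox")
for issue in fetch("https://api.github.com/repos/%s/issues?state=all" % REPO):
    subject = "[%s] %s (#%d)" % (REPO, issue["title"], issue["number"])
    root = "<issue-%d@github.example>" % issue["number"]
    box.add(as_mail(issue["user"]["login"], subject, issue["created_at"], root, issue["body"]))
    for c in fetch(issue["comments_url"]):
        cid = "<comment-%d@github.example>" % c["id"]
        box.add(as_mail(c["user"]["login"], "Re: " + subject, c["created_at"], cid, c["body"], root))
box.close()

The resulting mbox could then be imported into notmuch/mu and indexed like any ML archive.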

On audio things, I wonder how PulseAudio got into this thread, but I can't stop myself from commenting, sorry, feel free to skip the rest of the message :)

@coretemp

Cheers! Somebody not from SLNOS actually read the source of libcardiacarrest and understood the argument! I can die happy now.

@yegortimoshenko

It's just an API stub, there's nothing to read.

That's not entirely correct; there are about 100 lines there that are not stubs. Also a bunch of snarky comments about the original implementation of libpulse.

@Anton-Latukha

You also have probably never done a complex ALSA setup, overcome ALSA and soundcard bugs, learned their configuration language and its quirks, the bazillion special abstractions and identifiers, when all of that somehow does not allow you to do what you need.

Can you be more specific, please? In the last seven or so years I have yet to see a single machine on which ALSA didn't work out of the box with an empty asound.conf. All the problems with "ALSA" I debugged for others were problems with some crazy shit in asound.conf; the irony is that 33% of the time that shit was written there not by the user, but by a previously installed PulseAudio package. Removing/renaming that asound.conf/.asoundrc, followed by running alsamixer to unmute all channels, usually magically fixes everything.

I do "pro-audio" (mostly tinkering in LMMS) on pure ALSA for years without any issues. FYI, I can run jackd + LMMS fine, but it's useless with LMMS since LMMS can chain zynaddsubfx/other things and effects in a single process without requiring any sound daemons whatsoever, which means on ALSA + LMMS I can do ~20 tracks at the same time without bottlenecking the CPU, with jackd + LMMS ~10-15 will bottleneck and start to lag, last time I tried using PulseAudio it added enough latency to the output to make MIDI input feel really wonky when playing nothing else except that MIDI input alone. A high-quality software.

embraces race-condition chaos

Do you mean device numbering? Networking devices under Linux have the same problem (which is solved by either introducing persistent udev rules based on MACs, or by systemd device naming based on PCI addresses). Linux does partial ordering of device initialization to get faster startup; don't blame ALSA for the PCI infra in Linux and non-determinism in your chipset. There is a plethora of options for configuring ALSA defaults on non-deterministic systems; grep for "defaults" in https://wiki.archlinux.org/index.php/Advanced_Linux_Sound_Architecture

I imagine on a non-deterministic system you had to configure your default in PulseAudio too. Writing something like (aplay -l gives the list of cards and their names)

defaults.pcm.!card "PCH"
defaults.ctl.!card "PCH"

to ~/.asoundrc doesn't feel much more complicated to me than clicking around PA controls.

Did some process's sound stream take up the whole audio device?

Are you really sure it's not actually PulseAudio or jackd doing it? No other software I know of ever blocks hw PCMs, and ALSA has done dmix for all other PCMs out of the box by default for over a decade.

Seriously, all the sound problems can usually be solved by doing systemctl stop pulseaudio ; mv -i /etc/asound.conf /etc/asound.conf_ ; alsamixer, unmuting everything (M key), setting volumes (Up/Down arrows), and exiting with ESC. Assuming you reverted #35355 (which broke ALSA by default) or have sound.enable = true explicitly set in your configuration.nix, you are now done; everything should just work.

Ekleog commented 6 years ago

(Let's please keep PulseAudio out of the discussion, it's completely unrelated to it)

@oxij https://developer.github.com/changes/2018-05-24-user-migration-api/

I don't think it's accessible from outside the NixOS organization, but anyway I do think it would be best that someone from the NixOS team runs it in a cronjob for backup, and why not put it somewhere on nixos.org for download for people who'd want to script stuff on it.

It's not RFC5322, but it's still something that should be scriptable: JSON. (And a pre-emptive opinion: I think a JSON vs. email debate is just as irrelevant to this thread as the PulseAudio vs. libcardiacarrest debate :p)
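
For the cronjob idea, the announced migrations endpoints would make the backup roughly the following. This is a hypothetical sketch: TOKEN is a placeholder for an org owner's token, the Accept header is the preview media type the announcement required at the time, and error handling is omitted:

import json
import time
import urllib.request

API = "https://api.github.com/orgs/NixOS/migrations"
HEADERS = {
    "Authorization": "token TOKEN",  # placeholder; needs an org owner's token
    "Accept": "application/vnd.github.wyandotte-preview+json",  # preview media type of the time
}

def call(url, payload=None):
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data, headers=HEADERS)
    return urllib.request.urlopen(req)

# 1. kick off an export (repository naming as in the announcement's examples)
migration = json.load(call(API, {"repositories": ["NixOS/nixpkgs"], "lock_repositories": False}))

# 2. poll until the archive has been generated
while json.load(call("%s/%d" % (API, migration["id"])))["state"] != "exported":
    time.sleep(60)

# 3. fetch the tarball of JSON dumps, e.g. to re-serve from nixos.org
with call("%s/%d/archive" % (API, migration["id"])) as resp, open("nixos-backup.tar.gz", "wb") as out:
    out.write(resp.read())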

7c6f434c commented 6 years ago

Somehow nobody has mentioned these points, and I think they are relevant:

  1. GitHub seems to be somewhat usable in read-only mode without scripts, in text-mode browsers, fetchable by spiders, whatever. In GitLab, random things (like looking at a file in the repository) require a reasonably fully featured browser with scripts enabled.

1a. Of course, it is hard to predict what happens in a year and what the plans in MS are.

  2. «Worst that can happen» is not deliberate deletion of data. I think it is possible to perform a LinkedIn integration of GitHub in a way that would make sudden deletion of all our repositories on GitHub look like a better outcome. Or, see SourceForge before the last ownership change (it got better with the new owners, I know) — that is hopefully less likely because of .

  3. It is not likely that any integrated solution completely satisfies our access control wishes, and maybe our filtering wishes with respect to issues. I mean, all generic access control is too straightforward, and we constantly want people to be able to manage only a subset of issue labels or something like that.

That means that bots, be it ofborg or something else, are very likely to be a significant part of the infrastructure. An ability to put a reasonable mock-up of the hosting into the tests will probably help with bot development.

  4. Of course, if we really don't want to lose some information, it would be useful if copies could be made easily and also could be useful. I guess if we used something like Bugs Everywhere (just an example, chosen not to look like picking a side among similar solutions) it would be really hard to lose issue comments during a repository migration.

coretemp commented 6 years ago

@yegortimoshenko You put a link to all the patches to the community edition, which even seem to require copyright assignment (i.e. free labor). In no way did you respond to my criticism. GitLab is a VC-backed initiative, which could not be further from what e.g. RMS with Emacs represents politically. The only impact of NixOS moving to GitLab would be an investor opening a bottle of champagne, because NixOS is one of the larger projects on GitHub (see https://octoverse.github.com and search for NixOS).

Regarding libcardiacarrest, you have clearly identified yourself as someone who has not read it (as also highlighted by @oxij). This kind of talk lowers the confidence others might have in your other statements.

@oxij :100:

@Anton-Latukha Actually, I have done all the things you mentioned and more.

coretemp commented 6 years ago

@7c6f434c Yes, I didn't want to suggest even worse things myself, but I had thought of even more diabolical things one could do.

All good points otherwise, of which some further show that GitLab also has various downsides. Regarding access control, I have no doubt that one of the first things Microsoft will do is implement better access control to repositories, because:

  • there is a clear demand (even open-source projects like NixOS want it)
  • MS caters to the enterprise who certainly want this feature

I predict that in 5 years GitHub, GitLab and anyone who even remotely operates in this market will have such access controls implemented. This kind of bot development is just a workaround for a product that doesn't meet market demand (only talking about the access controls here).

7c6f434c commented 6 years ago

@7c6f434c Yes, I didn't want to suggest even worse things myself, but I had thought of even more diabolical things one could do.

LinkedIn spam is what MS currently owns and doesn't stop. So I think it is fair game to mention as a «conceivable problem». I hope it is not a likely problem.

All good points otherwise, of which some further show that GitLab also has various downsides.

Everything has downsides, and everything has downsides that annoy me personally, that's for sure.

Regarding access control, I have no doubt that one of the first things Microsoft will do is implement better access control to repositories, because:

  • there is a clear demand (even open-source projects like NixOS want it)
  • MS caters to the enterprise who certainly want this feature

Better is not the same as good enough.

I predict that in 5 years GitHub, GitLab and anyone who even remotely operates in this market will have such access controls implemented. This kind of bot development is just a workaround for a product that doesn't meet market demand (only talking about the access controls here).

Well, bots can work around a lot of requirement mismatches: we have unique CI desires, every project has unique ideas about access control, ofborg auto-calculates some metadata…

For access control: This problem might be too hard.

It is a feasible (not that hard) task in the «when in doubt, hardcode a new type of check, refactor later» style. I am not sure a generic solution will not end up Turing-complete anyway.

My impression from looking at how people describe access control in various systems (I was looking for notions to steal for a refactoring of an in-house system) is that it is currently a problem not solved in general, only in special cases.

And this is a problem more than 5 years old. I do not expect that Microsoft will solve it well enough for Nixpkgs, because we are exactly in the state where our demands are weird. It is likely that not enough enterprises want their least experienced employees to categorise the workload…

You are probably right that in 5 years GitHub will have power (with respect to issue tracking) similar to a well-configured Redmine instance (which can be had today, if that's what we want; though I think Redmine's pull request integration might be weaker without a specific review solution that is well supported by some plugin, and I don't know the details of the situation). I am not sure we don't want more flexibility.

oxij commented 6 years ago

@Ekleog

@oxij https://developer.github.com/changes/2018-05-24-user-migration-api/

I don't think it's accessible from outside the NixOS organization, but anyway I do think it would be best that someone from the NixOS team runs it in a cronjob for backup, and why not put it somewhere on nixos.org for download for people who'd want to script stuff on it.

Thanks! That's pretty awesome to know.

Can somebody with master access to the NixOS org (only @edolstra?) dump and share the tar produced by https://developer.github.com/changes/2018-05-24-user-migration-api/ ? That would at the very least alleviate the "what if they delete everything" concerns, and I'd like to look at that data and see what it would take to write a thing to convert those JSONs to RFC822.

If people insist on having discussions in four different places, I'd really like us to at least have a single place to store and distribute everything via rsync/git/tarballs, in RFC822 and (optionally) JSON.

Imagine such an archive indexed with Xapian and a search bar directly on nixos.org. I would use it (well, a local version of it) constantly and plaster "USE NIXOS.ORG SEARCH BAR BEFORE DOING ANY CODING" all over the place. I know that because I already use something like that daily: I have maybe 80% of all the stuff archived and indexed by notmuch at the moment, and half of the time I want to do something, there's already a message about it in there.

Every time I see duplication of work in NixOS PRs I want to scream "it was done before, just use search!", but then I realize that there's no official archive people could import into their notmuch/mu, and we are also not supposed to recommend doing that, because that's too hard for developers; everything needs to have an HTTP interface, for Participation!

Also note that most of FLOSS history anybody knows and reads about (GNU history, Linux history, OS wars, etc etc) comes from USENET and ML archives. Our old ML already perished (I always meant to ask why? Because @edolstra moved out of the university?), Google Groups will perish, Discourse will perish, GitHub will perish, the WWW/HTML (but probably not HTTP) will perish (roughly in that order) (and are gonna be replaced by Freenet-style distribution of Xanadu-style Hypertext documents in EDL-style format over GNUnet-style protocols). Meanwhile, archives in RFC822 format will be available and usable by our grand-grand-grand-grand-grandchildren (and, later, gigantic sentient cockroaches).

Let's make them! (To inform our grand*-grandchildren and please cockroach archaeologists.)

oxij commented 6 years ago

@7c6f434c

  1. GitHub seems to be somewhat usable in the read-only mode without scripts, in text-mode browsers, fetchable by spiders, whatever. ... 1a. ...

Actually, that's a very good point.

Also note that by switching from a proper ML to Google Groups, NixOS lost occasional anonymous contributors. Not completely, as you can see people occasionally doing PRs from clearly one-shot accounts. But the bar was much lower before. Everything requires registration now, and NixOS has no official address that accepts submissions via remailers and such (which are the only truly anonymous communication channel, btw). With the old ML such messages simply got stuck in premoderation, were let into the ML manually, and were applied by interested parties.

Not to blame anyone on that point, as "the computer industry is the only industry that is more fashion-driven than women's fashion" (RMS), but by pursuing "Participation!" we actually lost anonymous participation.

Can NixOS have a proper ML, please? Or, better, at least two MLs: #core+lib+stdenv+treewide-changes, #packages+services. Those clearly are different beasts, I want to read all of the first one, but almost none of the second one.

I'm OK with #users being a Discourse or whatever else, if it gets archived.

nightkr commented 6 years ago

Or, better, at least two MLs: #core+lib+stdenv+treewide-changes, #packages+services. Those clearly are different beasts, I want to read all of the first one, but almost none of the second one.

So... Discourse categories? You can change subscription settings for them individually, and you don't force everyone into the same dichotomy. For example, even light users will probably want to be aware of major treewide changes. That said, those categories do seem a lot better than @zimbatm's catch-all #nixpkgs category.

oxij commented 6 years ago

@teozkr

Archiving is a first priority, IMO. Everything else can be done later. How do I download archives from Discourse?

So... Discourse categories?

You see, I want to receive everything (for indexing), but at the same time have a header that can be used for filtering. Mailman has "List:" header that serves exactly such a purpose. GitHub sends some useful stuff in the headers but not nearly enough for the rate of PRs nixpkgs repo has. Discourse, as far as I can see, sends absolutely nothing useful in the message headers.

Another very useful thing about MLs is that you automatically receive the author's address in the "From:" header. GitHub and Discourse mangle those to make you use their server for what could be a private discussion between two parties; in ML land people frequently fork off an off-ML discussion seamlessly without asking anyone. That can be emulated with GitHub by grepping the repo for the email of the other party (which is meh, but better than nothing). AFAIK it can't be done with Discourse.

Another, somewhat less frequently useful, thing about MLs is that you can sign messages. There's talk of importing the old ML into Discourse, but I hope you realize that you wouldn't be able to import messages and keep the signatures valid unless Discourse can store and serve RFC822 messages directly (which it can't). Nor can you sign Discourse messages (except, maybe, in ASCII armor, which is unreadable and will make other people angry if you use it frequently enough).

Finally, you'd need to somehow import PRs with patches into Discourse to make the categories from my previous message useful. Meanwhile, GitHub PRs look to me like ML threads produced by git-format-patch with a bunch of bling, so I expect it should be pretty easy to convert GitHub's JSONs into RFC822+git-format-patch for posterity and ease of filtering. I don't think you can do that with Discourse (because it is not a patch tracker).

In short, the more I think about this, the stronger my love of MLs gets. When NixOS was developed via old Mailman ML things were easy to follow (with some mail filters) and the whole thing could have been made manageable by splitting that ML into two-three MLs, adding some maintainer automation and a patch-tracking bot a-la OfBorg. Instead, for the short-term gain of apparent simplicity (and Participation!) of GitHub we now pay with 3000 open PRs and issues, no public archive, and no way to filter all of this mess except by writing some more one-use software that can convert this mess back into RFC822 for archival purposes and filtering (which would also now require some unimaginable amounts of manual tagging).

I find it really surprising anyone still wants to continue using all these barely working things (be it GitHub, Google Groups, or Discourse) when there's Mailman 3 which is a proper ML like Mailman 2 was but with an optional bling-bling HTTP interface for Participation! and Engagement!.

coretemp commented 6 years ago

I find it really surprising anyone still wants to continue using all these barely working things (be it GitHub, Google Groups, or Discourse) when there's Mailman 3 which is a proper ML like Mailman 2 was but with an optional bling-bling HTTP interface for Participation! and Engagement!.

@oxij The reason is quite simple: in any technical community, only a small percentage would actually be considered experienced. In short, the average NixOS developer doesn't know any better. You are underestimating the amount of knowledge you have and additionally it requires a certain level of intelligence to actually make use of those tools effectively. Expecting that everyone in this community can do that is not realistic (the proof is that we are using those inferior tools, as you so eloquently explained). (This comment would be a good example of what would go via an off-list message.)

If the selection of communication tooling had been done better, actual needs would have been collected first, and candidate systems would have been eliminated based on some of the attributes you mentioned. Instead it turned into some kind of democracy of ignorance: those who have the most time to scream influence the decision the most, as opposed to those most competent.

The appropriate response to:

we gracefully received a free discourse instance: https://nixos.trydiscourse.com/ . I propose that we try this out as a replacement for the mailing-list. The goal is to encourage even further discussion by lowering the barriers to discussion. If we are happy with it, move it to discourse.nixos.org .

should have been: "Do you have a detailed set of metrics on which to evaluate this?", and then nothing should have been said on the topic until such metrics, with data, had been provided.

That discussion is a gold mine for the science of decision making and ideally everyone should also be interviewed to state why they said the things they did, because it's an excellent example of how not to do things.

https://nixos.wiki/wiki/Nix_Community lists the various people who are involved, but nowhere does it say how decisions are supposed to be reached when there is no consensus. RMS recently essentially asserted control over some project. Should we expect the chairman of the NixOS Foundation to also do that?

7c6f434c commented 6 years ago

You see, I want to receive everything (for indexing), but at the same time have a header that can be used for filtering. Mailman has "List:" header that serves exactly such a purpose. GitHub sends some useful stuff in the headers but not nearly enough for the rate of PRs nixpkgs repo has. Discourse, as far as I can see, sends absolutely nothing useful in the message headers.

As you already have your own mail server, just multi-subscribe.

(I do agree that having actual mail headers is better in principle)

Instead, for the short-term gain of apparent simplicity (and Participation!) of GitHub we now pay with 3000 open PRs and issues, no public archive, and no way to filter all of this mess except by writing some more one-use software that can convert this mess back into RFC822 for archival purposes and filtering (which would also now require some unimaginable amounts of manual tagging).

Well, let's not downplay «having a git repository more often up than not and not thinking about keeping an eye on it».

7c6f434c commented 6 years ago

In short, the average NixOS developer doesn't know any better. You are underestimating the amount of knowledge you have and additionally it requires a certain level of intelligence to actually make use of those tools effectively.

Of course, finding anything in the GitHub mess by now might require even more knowledge…

One would think that the Nix community should be pre-filtered for the ability to read a long document describing a weird workflow, though.

Instead it turned into some kind of democracy of ignorance: those who have the most time to scream influence the decision the most, as opposed to those most competent.

Naturally, in the choice of communication platform in case of a split the most lively platform will be the one chosen by people writing the most…

In general, decision making in Nix* might sometimes work less predictably (which doesn't always — in my opinion — lead to any better outcomes).

That discussion is a gold mine for the science of decision making and ideally everyone should also be interviewed to state why they said the things they did, because it's an excellent example of how not to do things.

That's a risky topic, because studying decision-making in Nix* might turn out to be more interesting than actually maintaining Nixpkgs, with unfortunate implications.

Should we expect the chairman of the NixOS Foundation to also do that?

We have a person who could do that, and he sometimes does (on various topics). Unfortunately, there are too many questions where people try to get the opinion of the same person, and he doesn't always have time to explain the position, let alone actually answer requests for clarifications (and there are signs there is not always enough time to look into the details beforehand).

oxij commented 6 years ago

@coretemp

... experience ... intelligence ...

I think intelligence is generally overrated, experience is the key.

You see, I find it strange that we expect our users to

We also expect our contributors to know enough git to rebase between releases (see the top README in the repo).

But when it comes to project management, we suddenly assume that we have to dumb it down because our developers can't use anything except a web browser in the most default config. I find that ridiculous. Sure, it is highly probable most of them have used nothing else, but I'm sure that if they learned git-rebase they can handle git-format-patch.

we gracefully received a free discourse instance: https://nixos.trydiscourse.com/.

As I said above, I have no complaints about using Discourse for a #users ML. I have a problem with it being used for development. #users and the #nixos IRC channel are support (as are StackOverflow, commercial support, phone support, etc etc). When people donate their time to answer questions for free I can only respect that, and I think they have a right to use any tool they want for it. But I think they themselves would agree that it would be prudent to do support in a way that can be archived and indexed for deduplication of efforts, which would be nice to have from Discourse.

developer tools

To be more precise about what I have in mind, I highly recommend looking at the system notmuch uses for bug and patch tracking: https://nmbug.notmuchmail.org/status/. That whole page is just a set of filters over the notmuch ML, maintained by a simple tagging system of https://notmuchmail.org/ itself and a companion script, https://notmuchmail.org/nmbug/, for collaborative tag editing and syncing via git.

I.e. Mailman ML + git repo with (MessageID -> [tag]) mapping + nmbug to make that repo easy to maintain + notmuch for actual filtering + some HTTP bling for the web page.

It feels like this is what @7c6f434c wants the tagging system on GitHub to be. You can have it today. Well, I'm sure NixOS scale will need some tweaks to the HTTP bling (e.g. paging :]), but notmuch itself can handle millions of messages without a hitch.
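
For a concrete taste, here is a tiny sketch against the notmuch Python bindings; the query and the namespaced tag are made-up examples, and the git syncing of the resulting (MessageID -> [tag]) map is exactly the part nmbug itself handles:

import notmuch

# open the index read-write; the Maildir path comes from ~/.notmuch-config
db = notmuch.Database(mode=notmuch.Database.MODE.READ_WRITE)
# made-up filter: everything mentioning ofborg that is still in the inbox
for msg in db.create_query("subject:ofborg and tag:inbox").search_messages():
    msg.add_tag("nixos::infra")  # nmbug-style namespaced tag
    msg.remove_tag("inbox")
db.close()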

government

I heard (and like) a theory that all arguments in software development are either

Humanity already invented the tools to solve both: elected officials and courts.

I think having elections would be a good start. If we had that I would immediately bring up the

up to an informed vote.

We have a person who could do that, and he sometimes does (on various topics). Unfortunately, there are too many questions where people try to get the opinion of the same person, and he doesn't always have time to explain the position, let alone actually answer requests for clarifications (and there are signs there is not always enough time to look into the details beforehand).

Which is why we need elections and courts.

I recently started tagging nixpkgs PRs that were killed with "this needs to be an RFC" and similar. Sure, something like an RFC can be useful for making a well-researched argument, but I'm sure I'm not the only one who feels that "make an RFC" for a three-line change in the core is a kind of "go away, I don't want to think about this right now", especially since https://github.com/NixOS/rfcs/tree/master/rfcs has no accepted technically substantial RFCs. Filtering indexed GitHub with 'from:"that person" and "RFC"' in notmuch is quite revealing.

Should we make an RFC for elected government? :)

Anton-Latukha commented 6 years ago

Go to Discourse.

7c6f434c commented 6 years ago

@Anton-Latukha funny timing — soon after the claim that Discourse mangles mail even worse than GitHub, and right after the claim that discussion-channel redirection has yet to result in anything productive…

@oxij

But I think they themselves would agree that it would be prudent to do support in a way that can be archived and indexed for deduplication of efforts

The «support» part seems to be a good catch — the speed of relevance decay is much higher, and deep-linking a section of a proper Wiki page is often a good enough deduplication solution, if someone is in the mood to search and not just answer from memory.

bhipple commented 6 years ago

@oxij @7c6f434c I'm very interested in getting set up with mail in notmuch + emacs, Mailman 3, and so on, with proper localhost indexing and search, but every time I start looking at it I'm slightly worried about the upfront costs, and (previously) I haven't been entirely sure it's a large enough upgrade over the browser to justify the investment. You've managed to convince me; are there any great "getting started" blog posts / videos / tutorials for beginners that you'd recommend [1], or should I just dive into the man pages?

[1] When I was first learning emacs and org-mode, I found these types of resources incredibly helpful, since the manuals tend to just say what you can do with the tool, while the tutorials tend to give an opinionated preview of what the author personally finds useful and valuable in the tool.

7c6f434c commented 6 years ago

For the record:

0) I wasn't really arguing that, for the mails as currently sent, local indexing provides that large a boost. Local archiving has some plusses, but the Nix* project is probably in a state close to the design target of git: we don't keep history records well, simply because we cannot afford to look at historical records anyway (given the rate of flow).

1) I don't use Emacs.

2) I do use a DB-to-FS mapping solution backed by PostgreSQL that allows me to add arbitrary SQL queries easily and have them mapped to a virtual FS (FUSE); I have comfortable bindings to interact with all that via Vim, and I also use it for email… and I don't actually bother to set up a reasonable tagging system. Different inboxes (~ per-project/per-facet), filtering by sender, a bit of grep over contents — and that's all I use. At least around Nix; for some things I also manually assign tags and batches. @oxij apparently keeps better records.

oxij commented 6 years ago

Mailman is server-side, so you probably don't want to touch it unless you plan to host an ML.

The notmuch tool itself is a no-commitment thing (but a gateway drug); it is just an indexer for mail in Maildir format. Download (if you haven't yet) your mail into a Maildir (with fetchmail, offlineimap, etc.), point notmuch to it (run notmuch in the shell and it will guide you through configuring itself), run notmuch new, and wait a while for it to index everything (the whole of LKML can take overnight; I have ~0.5M mails in my primary Maildir and it takes a couple of hours or so on my old laptop). After everything is indexed you can search stuff with notmuch search. That's it.

It will never edit your RFC822 files on disk. With maildir.synchronize_flags disabled, notmuch treats the Maildir as a completely read-only thing. With maildir.synchronize_flags enabled, it will sometimes rename files. If you or some tool later renames files in your Maildir, notmuch new will notice (using inodes and the Message-ID: header) and do all the right things with its db.

The Emacs notmuch-mode interface is just a very KISS MUA on top of notmuch search; you don't have to use it. You can keep using your old MUA (e.g. mutt, Thunderbird, Sylpheed, Claws) and use notmuch only for search (a bit cumbersome without Emacs) and for filtering into different sub-Maildirs based on tags (totally usable).

However, if you are going to use other MUAs and scripts that move mail between sub-Maildirs at the same time (like I did, see below), I highly recommend doing regular backups (or just putting your whole Maildir into git). I f*cked things up a couple of times while developing the meta-filtering thing mentioned below that moved stuff around. Daily backups and git (I do both, I'm paranoid) saved my ass; I fixed my tool, and now I don't even move anything anymore, because I can (see below).

My path to notmuch was (more or less, simplified)

get born, have no email -> ... wait awhile -> start using Outlook Express -> switch to Linux in middle school; turns out Outlook stores everything in some non-standard format -> bring up a local courier IMAP daemon and move everything from Outlook into it over IMAP (I don't remember if I came up with this myself or read about it somewhere, but this way everything magically became almost-RFC822 on the server) -> fix some non-standard shit Outlook does to RFC822 with a bit of python, sed, and some manual editing -> start using Thunderbird with the localhost courier daemon -> ... be happy for awhile; eventually Thunderbird becomes unusable with the amount of mail I store and search -> switch to Sylpheed -> ... be happy for a while -> write ~150 filter rules for Sylpheed -> ... that gets unmanageable; research what can be done -> try notmuch, get blown away -> reimplement the filters using notmuch and a script that moves mail between sub-Maildirs based on tags (stolen from the notmuch ML) -> use Sylpheed for receiving, reading, composing and sending mail, notmuch just for filtering into Maildirs and occasional search -> learn Emacs bit by bit (mostly for organizing myself in org-mode) -> Emacs notmuch-mode looks more and more interesting since, for instance, org-mode can cross-reference mails in notmuch, but I use a very different workflow from the notmuch devs, so it needs configuring, and I'm feeling lazy -> ... wait awhile -> Sylpheed's IMAP bugs out and eats some of my mail, I flip the table, switch to using fetchmail for receiving and msmtp for sending, configure notmuch-mode to my preferences and learn how to use Emacs message-mode for composing -> realize I don't actually need the local Postfix and courier daemons for this setup, as I can just configure fetchmail to use the maildrop MDA (much more KISS and reliable than postdrop, btw) directly with something like

set invisible  # don't mangle headers, this way refetching mails several times produces exactly identical files which means stuff is easy to dedupe
set softbounce # don't ever generate bounce messages

defaults
  mda maildrop # don't submit to local SMTP, call MDA directly

poll ...
 keep # at least before you become sure everything is ok

in the config, which also makes fetchmail results reproducible -> ... be happy for awhile -> write 2K filter rules for notmuch -> ... that gets unmanageable -> implement a rule generator for notmuch in python (that took a while, like a month of work in total over several years; I've been meaning to publish it for years too, but there's always something preventing me; soon, soon) -> write a ~600 LOC well-commented meta-rule file that gets compiled into 55K notmuch tagging rules that get toposorted and executed in the right order (and the whole thing usually takes less than 10 seconds, magic!) -> ... be happy for awhile -> disable maildir.synchronize_flags and the mail filters that move mail; I don't use anything except notmuch-mode anymore, so everything can just pile into a single inbox Maildir, which I rename manually sometimes simply to not hit the FS limit on the number of files in a single directory :) (and a separate spam folder that I do wipe from time to time); now my Maildir is pretty much append-only -> implement patch tracking directly in my meta-thing, realize that GitHub loses notifications sometimes -> ... be somewhat unhappy about that for a while -> date=now

So: my mailserver runs Postfix; I fetch mail with fetchmail from my own and a bunch of other servers directly into maildrop, which drops mail into my Maildir (no local mailservers involved); notmuch indexes it; Emacs notmuch-mode renders notmuch searches; and I compose using Emacs message-mode, which is configured to use msmtpq (msmtp with a queue a-la Postfix, see the msmtp repo) as sendmail for submission back to the servers.

I spend 5 hours per year on average editing my meta-rules (and that time shrinks year by year, as clocked by org-mode), mail filtering is mostly free, I very infrequently have to tag something manually (because my meta-thing has Bayesian filters built in), and I see spam in my inbox maybe once in a couple of months (and my mailserver only does greylisting and adds Authentication-Results: headers; everything else is done locally with my meta-thing).

Never going back.

In sum, learning Emacs org-mode and notmuch were certainly the two most productive things I did in my life; then learning Haskell, or Python, can't decide. I seriously think org-mode, email architecture (MTA, MDA, MUA, RFC822, MIME, PGP/MIME, SMTP), the use of text-indexing tools, and Haskell/Python need to be taught in elementary schools.

As a side note, I'm mildly excited about the Mailpile project, since it's basically a reimplementation of notmuch in python with HTTP bling. That would be an awesome gateway drug for regular people who don't live in Emacs and don't need 55K filtering rules.

edolstra commented 6 years ago

Slight aside: brevity is a virtue. Please don't use this issue to filibuster about mailing lists, Discourse, Pulseaudio, mail configurations, project governance etc., since probably not many people have the time to read through all of that. I've hidden some of the more off-topic comments.

oxij commented 6 years ago

@edolstra May I ask you to back up the repo with the https://developer.github.com/changes/2018-05-24-user-migration-api/ thing, as discussed above, and share the result, please?

Please don't use this issue to filibuster about ...

The current messy state of the PRs and issues of this repository is the direct result of not using the things we filibuster about here, and of the lack of governance, is it not?

It is also the weekend, and I felt like dumping my frustration about said things. I don't think anything except the couple of last messages specifically about notmuch, and half of the message about PulseAudio (which I marked as such), is off-topic. The rate of dumping was surprising even to me, though.

But, again, as noted in "the off-topic" messages, normally those things would be forked away into separate threads, but this is GitHub, not an ML, which is, of course, a fact that is "off-topic" itself.

I've hidden some of the more off-topic comments.

Funnily enough, that made most of those messages completely unreadable in the GitHub UI, as if to prove my "off-topic" points.

Anton-Latukha commented 6 years ago

@oxij

the lack of governance

  1. That is what is needed. As Linus said, the evolution of the kernel happens naturally, without any plan. And this is the only living model so far. Even neural networks evolve.
  2. I don't know about you, but my commits go through a great review process, and @jtojnar mentored me a lot.

  3. The number of PRs is stable (~660), so maintainers are keeping up the great work.

  4. GitHub could have even better bug tracking. But even now, issues show a slow but healthy net gain.

  5. And yes, a standardized full backup and a Free Software license - this would be a big benefit.

  6. The fewer the rules [RFCs], and the more wu wei the process is by itself, the better. Nixpkgs is a very wu-wei-able codebase.

coretemp commented 6 years ago

@Anton-Latukha

  1. Linux kernel development is very organized and supported by tooling; that's why it works.
  2. I think the review process has great variability, which would mean by definition that it's not a good process. The average review engagement is quite high, but this is only one of the factors determining process quality.
  3. It was "stable" at 200 in a similar way a year ago or so.
jb55 commented 6 years ago

basically everything @oxij said, but I am biased as a notmuch user. The only reason I found this thread is that I tag messages that have notmuch in the body. This type of automation and autotagging is just one of the benefits of running a fully standalone, indexed version of the repo's issue/PR history. My main issue with GitHub is that they don't provide enough email metadata to tag issues and PRs properly :(

Thank you @oxij!
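
(For the curious, that kind of autotagging is only a few lines with the notmuch Python bindings. A hypothetical sketch; note that the body: search prefix needs a notmuch built with Xapian field processors:)

import notmuch

db = notmuch.Database(mode=notmuch.Database.MODE.READ_WRITE)
# tag freshly fetched mail that mentions notmuch anywhere in the body
for msg in db.create_query("body:notmuch and tag:new").search_messages():
    msg.add_tag("notmuch")
db.close()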

oxij commented 6 years ago

@jb55 You're welcome :)

@ ALL

Btw, meanwhile GitHub staff fixed the "off-topic" message rendering and id-hash-linking bugs (kudos to me for reporting, kudos to them for fixing :]), so now is a good time to go back and read all of those "off-topic" messages, if you haven't got the original notifications and haven't read them via the source view yet.

In particular, let me remind you that we still don't have a backup (and, hence, an index) of anything...

As an experiment, I poked my head into several PRs since my last message to this thread and posted some links to related work from my index, but that's clearly unmanageable at scale, and I'm missing a bunch of early stuff in my index (and almost everything was invented already in the first 6000 PRs and issues, and I'm missing a bunch of those).

Can anybody with "Member" status in the NixOS org check if they can perform https://developer.github.com/changes/2018-05-24-user-migration-api/ ? Maybe it doesn't actually need "root" access.

7c6f434c commented 6 years ago

Do we have any member who sees all the teams inside the NixOS org without being an owner? (I am almost sure I do not.)