instructlab / community

InstructLab Community wide collaboration space including contributing, security, code of conduct, etc
Apache License 2.0

[Comment by 2024-05-31] RFC: Should we continue to use Slack for real-time chat and/or look at forums software? #279

Closed jjasghar closed 5 days ago

jjasghar commented 4 months ago

This has been covered by several tech news sources, but Lifehacker seems to sum it up quite nicely.

We should have a discussion about whether a free-tier workspace can opt out of this AI model training, and start discussing a plan in case we can't turn it off. It looks like most of the content is about companies and how they can opt out, but free tiers seem to be left out of the conversation.

/cc @joesepi

jimmysjolund commented 4 months ago

Should the discussion be in this issue, GitHub Discussions, or in an email thread?

jjasghar commented 4 months ago

Should the discussion be in this issue, GitHub Discussions, or in an email thread?

We have so many places to have this discussion. Right now it seems people pay attention to GitHub Issues more than anything, so that's why I started here. I know we'll talk about it in the different public meetings and I'm hoping people capture what is discussed there and put it at least here.

The funniness of starting the first real thread in users@instructlab.ai by saying we're moving away from Slack is not lost on me either.

jjasghar commented 4 months ago

Just to keep this linked, this was at least one of the issues where we decided to use Slack: https://github.com/instructlab/community/issues/107

nathan-weinberg commented 4 months ago

I'd be in favor of keeping Slack but opting out if that's an overall option

russellb commented 4 months ago

I'd be in favor of keeping Slack but opting out if that's an overall option

I’m a huge fan of Slack-style chat, but free-tier Slack is not great. Losing history there is like deleting all of our GitHub issues after 30 days. If this helps make the case for a persistent chat service, that’s welcome in my book.

jimmysjolund commented 4 months ago

Slack has posted a clarification: https://slack.com/blog/news/how-slack-protects-your-data-when-using-machine-learning-and-ai

However, it keeps saying "customers can opt-out" and "customer data", and I wonder whether users of the free tier are considered customers or not. To me, a customer pays; otherwise you are a consumer.

jimmysjolund commented 4 months ago

Personally, it doesn't matter to me whether Slack would use any of the discussions for any training or not. It's a public discussion in an open source project. There are no secrets there. It's also a bit funny to me that an open source project that wants people to contribute to training ilab would, at the same time, take a stand against anyone else training their LLM (which Slack, btw, tries to say they won't do).

The main reason I would move away from Slack to another solution is the history. I think it could be beneficial to have the history long term, as was done historically with mailing lists and IRC archives. And I do appreciate open source tools over proprietary ones. Then again, I also believe in lowering the barriers for contributors to get access and in choosing a platform that is common and easy to use (and maybe that is a paid Slack solution).

lhawthorn commented 4 months ago

With the announcement by Slack that they will use user chat history to train their AI, some folks are less comfortable using Slack. The community maintainer team will find out if we are able to opt out of AI training, as we are using the free tier. We will report back in this issue.

bjhargrave commented 4 months ago

I am ok with Slack free tier (with only 30 days history viewable). Any decisions made on Slack need to be memorialized in the proper mail list or GitHub repo.

As a point of reference, this is how the Eclipse Foundation uses Slack.

If we choose to abandon Slack, we should just move all discussion to mail lists. (If Google Groups mail lists are an issue for some, we should move the mail lists to some Mailman provider or Discourse.) I am not super-excited about installing another chat client :-/

lhawthorn commented 4 months ago

Since we are noting issues with Slack that include the AI training question and the fact that use of the free tier means we lose chat history after 90 days and the paid tier is cost prohibitive (at current usage $4000 USD/month), we are opening discussion with the community about how they would like to handle real-time chat / free flowing discussions.

Alternatives discussed included:

We also note that access to our email lists requires subscribing with a Google Account, which can be any email address that is registered with Google to make it a Google account, but not every person's employer/institution allows them to register their email address as a Google account. It is also known that current community members work for employers who do not permit them to use Discord for work business.

There are pros and cons to each of these approaches. While we have a goal to grow the InstructLab community to serve people working across a wide variety of disciplines (which is why we have email lists!), we need to make a decision on this matter sooner rather than later.

In our community meeting on May 21, 2024, we noted that for now we need to optimize for our maintainer team so that they can get their work done in the best possible way and preserve those design discussions for the long-term as that history will be a critical part of understanding how the project works and why it works that way.

Please take a moment to contribute your thoughts on what service(s) you like/would prefer/cannot use and any thoughts on how to optimize for our many current needs.

The community maintainer team will review the feedback submitted. We are leaving this matter open for comment until end of day May 31, 2024 (wherever in the world you may be).

Following review of the feedback, we will make a decision before the June 4, 2024 weekly community meeting. We will notify the community of our decision via the June 4th community meeting, Slack, this issue, and the Community email list. We will leave the issue open for comment for another 48 hours following the announcement, to give anyone who thinks we missed something time to comment.

Following that time period, the community maintainer team will migrate the community to whatever solution is chosen, or outline more rigorous guidelines for using our existing communication channels if we go with status quo.

Regardless of the outcome, the community maintainer team will make sure that more rigorous communication channel documentation is published in the community repo, perhaps as an add-on to our Collaboration guide.

bjhargrave commented 4 months ago

who do not permit them to use Discourse for work business.

Discourse or Discord? (it's annoying they have such similar names!)

nathan-weinberg commented 4 months ago

Since this is such a wide scope and will affect the whole org, could we push this issue to the community Slack channel and mailing list to ensure we capture as much of the current community's feedback as possible?

lhawthorn commented 4 months ago

who do not permit them to use Discourse for work business.

Discourse or Discord? (it's annoying they have such similar names!)

Yes, it is annoying. I think I made a mistake in the issue and people are actually unable to join Discord for work business. (I will edit the issue.)

jimmysjolund commented 4 months ago

My issue with the mailing list is that it requires a Gmail address. I'm not too keen on registering for one for only one mailing list. I want my email sent to my usual address.

I don't think we should replace Slack, a real-time chat, with Discourse. They are, to me, not comparable. If there were to be a forum, I would suggest utilizing GitHub Discussions. It's already here and you wouldn't need to register on yet another site, whereas Discourse is a bit costly to use properly. We just moved another community from Discourse to Discussions as the cost vs value was not up to par.

It seems there are no perfect solutions, so I would hope for Matrix.

bjhargrave commented 4 months ago

My issue with the mailing list is that it requires a Gmail address. I'm not too keen on registering for one for only one mailing list. I want my email sent to my usual address.

You don't need a Gmail address. You need a Google account, and you can create one for an existing email address (which is not a Gmail address).

russellb commented 4 months ago

I feel strongly that chat should remain. I would also strongly prefer a chat system where we don't lose access to our history.

If we drop chat, my prediction is that chat just moves to company internal or ad-hoc less official chat locations and not that everyone moves to mailing lists or whatever is set up as an alternative.

russellb commented 4 months ago

who do not permit them to use Discourse for work business.

Discourse or Discord? (it's annoying they have such similar names!)

Yes, it is annoying. I think I made a mistake in the issue and people are actually unable to join Discord for work business. (I will edit the issue.)

what's weird is I thought the same concern applied to Slack as well, but here we are ...

joesepi commented 4 months ago

My experience around open source projects/communities trying to be open, transparent and inclusive is that communication and information should be done in the following ways:

  1. GitHub issues, PRs, documentation, meeting notes
  2. Email lists/discussion
  3. Slack

Those are listed in order of importance.

  1. GitHub: as much as possible, things should be done in GitHub so that they are not only accessible but also versioned. It sometimes takes a bit of effort to take a conversation and memorialize it in an issue or in another way, but it's very important to always aim for information to land in GitHub.

  2. Email lists/discussion: should also try to land any decisions or important information in GitHub where applicable.

  3. Slack should be treated as ephemeral and, when possible, conversations and information should land in GitHub where applicable.

And in case it wasn't clear: I think we should continue using Slack but do so with the understanding that 90 days is short and Slack shouldn't be seen as a place where we can go back and review any previous conversations older than 3 months.

hickeyma commented 4 months ago

I agree with @joesepi's viewpoint in https://github.com/instructlab/community/issues/279#issuecomment-2123196466.

I also agree (partially) with @russellb on maintaining a chat system, as it's important to be able to discuss in pseudo real-time and to join a conversation topic asynchronously when a person catches up. I do think, though, that chat is for helping people with their questions and problems, and for having valuable discussions on topics. As a community matures, it is usually providing feedback that is already documented (docs, issues, PRs, etc.) and being helpful to the user. If new valuable information comes to light, then that should always feed back into the community through docs, issues, PRs, etc. You should not be dependent on history in a chat system. In other words, improve the project(s) as we go along so that we are not pointing to old chat history.

russellb commented 4 months ago

I guess I'll record a concrete proposal:

  1. We stop using Slack.
  2. We use Discord. It's hosted and free. There are lots of administration and moderation tools we can add on, some with reasonable fees for more advanced features. One I've used is MEE6.

Edit: I'm open to Matrix, too. I haven't used it. I only have a slight preference for Discord because it's a client I'm already using. I understand that as a point in favor of Slack from others. Discord > Matrix > ... other stuff probably >>> Free Slack (losing history)

n1hility commented 4 months ago

I agree that switching to Discord makes the most sense. It has the usability we need for a diverse skill set. It has a pricing model that is compatible with our need to scale, and it's popular with other communities.

nathan-weinberg commented 4 months ago

I'm not interested in getting a new chat client and likely won't participate if we migrate to Discord. Not to mention, as said above already, there are current community members who work for employers who do not permit them to use Discord for work business.

I officially propose what @joesepi outlined above - keep Slack, opt out of the AI stuff (somewhat ironically), and primarily use GitHub/Mailing Lists for discussions that need a preserved history.

jjasghar commented 4 months ago

I do want to say that switching to Discord will cause conversations with the "hard liners" in the FOSDEM and FOSSY communities to question our commitment to open source. The idea of using a proprietary system like Discord when the Matrix federated network exists will be a friction point.

By creating something like matrix.instructlab.ai, we can then federate to the FOSDEM network and have access to the 6000+ Open Source engineers in there and help build out our community via that avenue.

On a personal note, I helped run/moderate the AI and ML room at FOSDEM last February, with easily 600+ people in the room all day Saturday. If we showed that community we were there with Open Source and met them halfway with Matrix, our friends in the EU would probably be warmer to joining our project.

jimmysjolund commented 4 months ago

I do want to say that switching to Discord will cause conversations with the "hard liners" in the FOSDEM and FOSSY communities to question our commitment to open source. The idea of using a proprietary system like Discord when the Matrix federated network exists will be a friction point.

By creating something like matrix.instructlab.ai, we can then federate to the FOSDEM network and have access to the 6000+ Open Source engineers in there and help build out our community via that avenue.

On a personal note, I helped run/moderate the AI and ML room at FOSDEM last February, with easily 600+ people in the room all day Saturday. If we showed that community we were there with Open Source and met them halfway with Matrix, our friends in the EU would probably be warmer to joining our project.

Aren't Slack and Discord in the same category in that sense?

jjasghar commented 4 months ago

Aren't Slack and Discord in the same category in that sense?

Yep, and "we" were overruled by executive decisions when the project was in its infancy and not as "public."

lhawthorn commented 4 months ago

I think Discord makes the most sense, too, but when we say "it is free" - how is it free? I know the Fedora project's Discord server costs Red Hat a decent chunk of money each year, which is why I never chased this down in the first place.

1) Please educate LH as to how we magically get to use Discord for free when Fedora project pays a lot of money for it.

2) In parallel, LH will go find out if we can piggyback on that contract for another instance but this will not be awesome if Fedora decides to stop footing the primary bill and InstructLab would need to then cover that full cost to sustain our use of the service.

3) @jjasghar Do people in EMEA think Discord is not awesome?

Finally, I cannot believe these words are coming out of my mouth, but I am having trouble saying "we must use FOSS-only tools lest we also peeve our hard liner free software friends." I love my buddies who only use free software but ... GitHub is not FOSS. As such, haven't we already lost their confidence? I think it's hard to argue we should optimize our chat tooling for inclusivity of community members who only want to use FOSS tooling for chat when we create code on GitHub, host our meetings on Google Meet, post our community meetings on Google Calendar, etc.

This remark in no way besmirches the view that we should prefer open source tooling for our open source community, but if that's to be part of our default stance on tooling we need to start ripping a lot of things out that I don't think we should be looking to replace.

lhawthorn commented 4 months ago

who do not permit them to use Discourse for work business.

Discourse or Discord? (it's annoying they have such similar names!)

Yes, it is annoying. I think I made a mistake in the issue and people are actually unable to join Discord for work business. (I will edit the issue.)

what's weird is I thought the same concern applied to Slack as well, but here we are ...

Ok, that's a new one for me - I am sure some folks are forbidden to use Slack for work purposes, but that's not one I have heard.

@caradelia You likely have the most useful data here as you work with folks in Financial Services and as a highly regulated industry they likely have the most stringent guidelines. Do your contacts not get to use Slack? I know many folks at banks don't get to use GitHub, either, but we're still going to use it. :D

jjasghar commented 4 months ago

Do people in EMEA think Discord is not awesome?

The audiences we're trying to reach are developers and open-source engineers (at least right now). Most of these humans are strongly open-source focused. Discord was designed for gamers, for building up streaming communities quickly, and for "normal" (non-engineer) types to join quickly and play video games. The core problem is that none of it is open source, and the FOSDEM crowd I've come to know and love will bristle at the idea that we are attempting to build an Open Source project and community on Discord.

I'm not saying there haven't been successful ones; Python and, yes, Fedora both have great communities. But if we are attempting to show that we truly are open source, choosing Discord goes against that grain (or at least from my understanding of this crazy world we live in).

lhawthorn commented 4 months ago

Software is easy, people are hard. - LH, Engineer of Human Systems

Let us continue to discuss and get more input. I am sure we will get where we need to go.

danmcp commented 4 months ago

I am a +1 for Discord. Both Slack and Discord have the closed source downside, so we aren't making it any worse by switching. While I agree with @joesepi's communication protocol, it's still nice to have history to be able to search through previous discussions, and that is worth the switching cost to me. The other complaint I see is from those who don't want to adopt another client, but there will probably be many in the community for whom InstructLab is their only use case for Slack. For me, additional Slack workspaces are just as difficult to context switch between as multiple chat clients.

jimmysjolund commented 4 months ago

I think Discord would narrow the number and diversity of potential contributors. In that sense, I believe staying on free-tier Slack is a better option. Many developers and non-developers already have to use Slack at work or elsewhere. They are familiar with it. I see no benefit in changing to Discord. It's a change for the sake of changing. Neither is FOSS and both are expensive unless sponsored. As mentioned, Discord has a more gamer-focused audience (true or not) and will often require creating a new login. Sure, that applies to Slack or Matrix as well, but as I said, Slack is probably the most common one of the three.

Yes, GitHub is not FOSS either but that would be a much larger step to change and would potentially put the project out of sight for many. Then again, even a large project and community like Curl is discussing whether to stay with GitHub or not as there have been voices heard complaining about the Microsoft connection.

jimmysjolund commented 4 months ago

My issue with the mailing list is that it requires a Gmail address. I'm not too keen on registering for one for only one mailing list. I want my email sent to my usual address.

You don't need a Gmail address. You need a Google account, and you can create one for an existing email address (which is not a Gmail address).

While it's true you don't need a Gmail address to sign up for a Google account, in (my) reality it proved to be less trivial than expected. I could register a new account, I verified it with my mobile, and I joined several mailing lists, but I never received any emails. While trying to log in to Google with my new account, it turned out that I had once used my mobile number to verify another account (on my Pixel), so this new one was probably a fraud attempt. I could, however, provide another number or "use a friend's phone", which is not a great way to verify your account. "No mailing lists for you."

cybette commented 4 months ago

I'm adding my +1 for Matrix. We have used it successfully with the Ansible Community (500+ members in the Ansible space) and I think Fedora has quite good results with it as well (4000+ in the Fedora space). The number of users is probably higher, as you don't need to join a space to join a single room in the space (generally). However, joining a space provides you the list of all related rooms in that space, so you can see the different discussion topics happening within that community.

When we had the Ansible Lightspeed technical preview, we used a Matrix room for people to join to ask questions and receive news. We had hundreds of people join the room, although most of them didn't join the Ansible space, and that's ok because they were mainly interested in that one specific topic. Room history is preserved and can be made public as well.

You can think of a Matrix space as similar to a Discord server or a Slack workspace. With a single Matrix ID you can join multiple communities/spaces, join individual rooms that may or may not be part of a space, DM people across different spaces and matrix homeservers, all within a single Matrix client or browser window.

Sure, if you don't already have a Matrix ID, it's one more account you need to create. But once you enter the Matrix, you'll have access to many communities (Gnome, Mozilla), cool events (FOSDEM, DevConf.cz), etc. Is there a learning curve? Maybe... but I've been using Matrix since 2017 (coming from IRC) so I may not be the best person to ask. A couple of days ago someone commented in the Ansible social room:

So far I like this communication medium too! :)
I had never used Element before. Its cool!

(Element is a web client for Matrix.) And here's the direct link to that message. I think Matrix is more accessible than people give it credit for, you just need to get used to calling channels rooms and servers spaces. Or homeservers. Ok that part is a bit confusing as you can create a space with rooms from different homeservers. Like I had a private IRC space to collect all the bridged IRC channels across several IRC servers together. But I digress and this is not helping my argument :sweat_smile: I think to get people started in a few main chat rooms is rather straightforward.

HOWEVER... (there's always a but), there are costs associated with it. Instead of creating everything (rooms, spaces) in matrix.org, if we want our own homeserver like matrix.instructlab.ai or simply instructlab.ai, we'd either have to self-host or pay a provider. It's still a lot cheaper than Slack (hundreds of dollars per month instead of thousands).

cybette commented 4 months ago

I've also created #instructlab:matrix.org, mainly to make sure we reserve the room name on the main matrix.org homeserver. Feel free to join and have a look at how things work. There's threading (which I used to dislike but have now learnt to accept) and I really like the cascading read receipts. Don't know what I'm talking about? Join and find out :wink:

edmundronald commented 4 months ago

Why not just use discussions and issues on GitHub? It provides newcomers with an instant snapshot of where things are at, and acts as an institutional memory.

One (obvious) idea might be to have an llm helpbot that gets updated as the project progresses to help users solve issues.

Ed

lhawthorn commented 4 months ago

@edmundronald The goal is to meet people in the many places that they are, since InstructLab is useful to people across a wide variety of industries and use cases, and people prefer to actively communicate in different ways. Real-time chat is also a nice way for people to feel connected to one another even though they are working together distributed across the globe, and we can lose that feeling if we rely on just GitHub issues (or even GitHub issues/discussions + mailing lists).

Thank you for the feedback, though, and very fair question.

lhawthorn commented 4 months ago

I think Discord makes the most sense, too, but when we say "it is free" - how is it free? I know the Fedora project's Discord server costs Red Hat a decent chunk of money each year, which is why I never chased this down in the first place.

TIL that Fedora is paying a lot of money for Discourse, not for Discord.

(I would be more grumpy at myself for this error were it not for reviewing past meeting notes and seeing the error written down not by yours truly. Mistakes were made, let's fix it. :)

The following is by no means a decision, but rather a summary of live discussion:

In yesterday's community call, we talked extensively about tradeoffs and real-time chat vehicles. (LH TODO update this issue with link to recording when available in InstructLab YouTube channel.)

I believe that one proposal was floated that would satisfy all the people's needs and desires:

Thank you to @RobotSail for letting us know there is tooling from Protocol Labs that could bridge together Discord, Matrix, and Slack into one great glorious whole. (The key there being the "w." :)

@cybette Would you please work with the Red Hat OSPO community infrastructure team to scope how this might work, what it would cost, and if they are ready, willing, and able to support such infrastructure for InstructLab? I understand there would be costs associated with e.g. a custom Matrix server like matrix.instructlab.ai, so let us spec out all the options and record them here so folks can get a sense of the scope/expense.

As someone who is largely ignorant of the use of these three services save for Slack, I am also interested in understanding how our chat history is preserved long-term, where it is logged, if any of these services will monetize our community members' data somehow or use it to train AIs (our current gripe with Slack), and how we might be able to export our history and replicate it elsewhere should a particular provider cease operations.

@jasonbrooks @mscherer Thank you for taking a look at this matter with @cybette. We appreciate you!

And last but not least, I personally have some friends who are university researchers that are interested in what we have done so far with InstructLab. Once upon a time - it was ages ago, I think back in 2018? - I attended a conference in Berlin with several university researchers into open source and they commented how hard it was to do research on open source projects now that they were moving away from IRC, which often had openly accessible and easily parseable logs. Given both current interest from researchers and our roots as a research project, I would be truly delighted if we could understand how our logs might be easily accessible for those who would want to write academic papers about our community and its work. None of the outcomes of that inquiry should be show stoppers, but I think we would be remiss not to consider this use case for our community data. Maybe the CHAOSS project has given some thought to this matter.

edmundronald commented 4 months ago

@lhawthorn There are many ways to communicate, and all of them are useful and allow people to feel connected. I see that every day with my teenager who hates it when I call him by voice, while my older friends like it. Regarding software, for myself, I find forums the way I feel most connected, and I especially like the fact that I can locate someone by "accident" who was interested in something several months ago, and search out that person.

There might be several sub communities here, eg, users and maintainers and possibly "forkers". The maintainers might wish a feeling of coordinated cooperation while users want a more detached experience.

Regarding myself, I am really old (6X) and I used to teach AI. I saw a post on Medium and looked at the paper, and I really liked the base concept. I downloaded the software and fought with it for a day to get it to run (the time to realize the Python on my M3 was Intel rather than Arm, an issue which could be detected as an explicit error). Then I got it to train, but the training doesn't seem to be very useful: it doesn't solve the "Betty is 20 years older than Joey, in two years Betty will be twice as old as Joey" question, and I don't exactly know what files it has created, how I'm supposed to tell it what to overlay over the base model, etc.

Maybe a tool of some sort which tells you which models lab is seeing, what overlays, what training it can see has already been done to create these overlays etc? It seems to me that additional metadata files could be written into the taxonomy hierarchy, and that in some ways training and retraining a model with various skills or knowledge for variable amounts of time creates an inheritance scheme between models, eg I have model M and skills A and B, it's not obvious at all that M+A+B= M+B+A in difficulty of training or even in end behavior ...

I guess I should either register on Slack or come back in a few months when the documentation will have improved? If there is a documentation stub I could contribute to its improvement ...

Edmund

jimmysjolund commented 4 months ago

  • Start pushing folks to use our mailing lists more extensively, anyway, since pretty much everyone can get email and make use of it, whereas the above three tools may be inaccessible or undesirable for some

Though it turned out not to be so easy, even with an existing email, unless you want to use an existing Google account or sign up for a new one. Even when signing up for a new account with a non-Gmail address, if you have ever used your mobile phone for a Google service some time in the past, you might run into the trouble I did, where I'm not allowed to use the same mobile number as it has already been used once in the past. Hence, Google won't let me sign up to the mailing list with my preferred email.

So, while everyone can get email, maybe not everyone can get email through a Google service.

edmundronald commented 4 months ago

  • Start pushing folks to use our mailing lists more extensively, anyway, since pretty much everyone can get email and make use of it, whereas the above three tools may be inaccessible or undesirable for some

Though it turned out not to be so easy, even with an existing email, unless you want to use an existing Google account or sign up for a new one. Even when signing up for a new account with a non-Gmail address, if you have ever used your mobile phone for a Google service some time in the past, you might run into the trouble I did, where I'm not allowed to use the same mobile number as it has already been used once in the past. Hence, Google won't let me sign up to the mailing list with my preferred email.

So, while everyone can get email, maybe not everyone can get email through a Google service.

A solution might be a composite of several sub-solutions :) I notice that GitHub is mailing this thread through to my Gmail without my signing up specifically to a mailing list, which is actually useful to me :) Other people on GitHub might already have some sort of gated-through combination of methods that works?

Edmund

russellb commented 4 months ago

Before adopting a multi-chat-system bridge, I'd like to understand clearly what the tradeoffs are. Systems like that tend to end up supporting the least common denominator of features across the supported services. I haven't used this specific service before though, so I'm not judging it yet. I just want that option to be approached with appropriate caution and skepticism.

RobotSail commented 4 months ago

Before adopting a multi-chat-system bridge, I'd like to understand clearly what the tradeoffs are. Systems like that tend to end up supporting the least common denominator of features across the supported services. I haven't used this specific service before though, so I'm not judging it yet. I just want that option to be approached with appropriate caution and skepticism.

My understanding of slack <--> discord is that there is feature-parity when it comes to communication, e.g. message reactions, threads, mentions. I'm not sure about the Matrix bridges though.

jimmysjolund commented 4 months ago

My experience with bridges has been that they break and require maintenance. While I understand the notion of making it easy for people to engage, there's also a line somewhere where trying to meet people everywhere ends up being too scattered. I'm also wondering what the goal and audience are. In previous chats and the community meeting we have been talking about "open source people", "AI people", and the maintainers. Who is our main focus here? I was hoping we are looking to engage ALL THE PEOPLE, meaning it's not "just" the open source or already AI-savvy folks, but random/usual people who have knowledge and skills useful for InstructLab. I don't want us to end up with a service that is biased towards "here are the skills from the FOSS and AI people". The idea, to me, is to get the broader audience engaged, to make a difference compared with every other initiative. If that comes later and we need to focus on the maintainers or the AI crowd now, what is a good solution that would allow for the other people later on?

bjhargrave commented 4 months ago

Discord’s turning the focus back to games with a new redesign

So this appears to be a negative for moving from Slack to Discord.

edmundronald commented 4 months ago

My experience with bridges has been that they break and require maintenance. While I understand the notion of making it easy for people to engage, there's also a line somewhere where trying to meet people everywhere ends up being too scattered. I'm also wondering what the goal and audience are. In previous chats and the community meeting we have been talking about "open source people", "AI people", and the maintainers. Who is our main focus here? I was hoping we are looking to engage ALL THE PEOPLE, meaning it's not "just" the open source or already AI-savvy folks, but random/usual people who have knowledge and skills useful for InstructLab. I don't want us to end up with a service that is biased towards "here are the skills from the FOSS and AI people". The idea, to me, is to get the broader audience engaged, to make a difference compared with every other initiative. If that comes later and we need to focus on the maintainers or the AI crowd now, what is a good solution that would allow for the other people later on?

I think this very reasonable question affords an obvious answer: see where people came in from. If we look at this as a GitHub discussion, then the obvious answer is that we should already be maximizing our use of the available GitHub tools, which incidentally are emailing me very nicely, thank you, every time somebody posts a comment here :)

Ed

mscherer commented 4 months ago

Reminder that the Slack and Discord ToS might put limitations on bridging with Matrix (or with people who have not accepted their ToS). There is also COPPA compliance, as Slack and Discord are under US laws, but Matrix.org applies UK/EU ones. The same goes for OFAC compliance, for the same reasons.

lhawthorn commented 3 months ago

@mscherer Thank you, I knew there was a reason we thought this was a bad idea.

cybette commented 3 months ago

Thanks to everyone who has provided feedback here and through other channels.

TL;DR - My recommendation: Matrix as the real-time chat platform for InstructLab community

The full analysis and reasoning is available in a public doc, and also reproduced here:

Analysis from discussions

(Discussions on GitHub, Slack (RH Internal), and Community meeting)

Platform: Slack
Pros:
  • Familiar tool for most
  • Popular among software developers
  • Inertia (we are already using it)
  • Encourage people to use ML for discussions (to preserve history)
Cons:
  • Using data for AI training / privacy issues
  • No history (unless we pay)
  • $$$$ / month
  • Need to renew invite link every 30 days
  • Proprietary
  • Severe moderation issues
  • ML usage requires Google account

Platform: Discord
Pros:
  • Has history
  • Free
  • Pretty good moderation
  • Popular for gamers and AI communities
Cons:
  • Proprietary
  • Severe privacy issues
  • Not permitted by certain employers
  • Yet another chat app
  • Disliked by FOSS communities
  • Turning focus back to games

Platform: Matrix
Pros:
  • History + data control
  • Excellent privacy
  • Pure open source
  • Decent moderation (need to set up tools such as Mjolnir)
  • Federated, reaching wide FOSS communities (not restricted to a server / workspace)
  • OSPO has experience managing Matrix for communities (Ansible, Devconf.cz)
  • We have Red Hatters in the Matrix foundation (board member)
Cons:
  • Some effort needed to self-host or pay for hosting ($$$ / mth)
  • Not familiar to people outside FOSS communities
  • Learning curve for new users
  • Yet another chat app to install for most people

Points to consider

Do we optimize for maintainers (discussions with history), or our target contributors (open source focused communities), or AI savvy crowd? Keep in mind that new contributors may become maintainers down the line.

Or do we want to engage as widely as possible? (however, none of these tools will probably qualify, as each has a certain focus/targeted audience)

Using Matrix and bridging with Discord and Slack was considered; however, the Slack and Discord ToS might put limitations on the bridging. There is also COPPA compliance, as Slack and Discord are under US laws, but Matrix.org applies UK/EU ones.

Before we get to the proposals, I think we can rule out Discord completely because some of its cons are showstoppers. Free AND retains history? Of course there’s a catch -> severe privacy issues which is why it’s blocked by some employers / banned in certain countries.

Proposals (possible solutions)

Switch to Matrix

Go full-in with Matrix. It’s not free, but the cost will be around a tenth of paid Slack. It has a lot of advantages, and we have people in OSPO with experience managing Matrix servers and rooms. We might get some initial resistance from some people joining (which makes up most of the cons list) but in the long run there’s less restriction to the growth of the community. Even if Element (the company behind Matrix) shuts down, we still have control over our history + data, can migrate to a new host server, and continue chatting with the rest of the Matrix fediverse.

Slack + Matrix

Keep Slack for now and create a few (maybe 2 - 4, but no more than 5) rooms on matrix.org as a start. No additional cost, can reach different groups of people we might want to engage with, but at a cost of segregating the community and diluting the attention of community leads who need to keep an eye on all the communication channels. And we’ll probably end up with the same discussion again after a while – of whether we should fully switch over. (And any major changes like this can disrupt the community.)

Remain on Slack

Stay with Slack and encourage the use of mailing lists on Google Groups to preserve discussion history. No disruption to the status quo. But even if we leave the history issue aside, the privacy and moderation issues will limit the healthy growth of the community.

My recommendation

I think the best course of action for the InstructLab community is to switch from the current free tier of Slack to Matrix. In addition to the points I’ve made in the first proposal above, I’ve seen it work well for communities such as Fedora and Ansible; in the latter, I promoted and helped with the implementation of Matrix as the main communications platform.

To address possible concerns, I will:

I believe that for the long term well-being of the InstructLab community, Matrix will provide a reliable platform for real-time communication needs, where we can retain the chat history and have control over our community’s data.

(1) We can export data from public channels and use one of the following possible tools to convert the json export to html: (to be tested)

(2) For reference, hosting ansible.im with EMS costs < 450 USD per month. The devconf.cz matrix server is hosted by OSPO so the cost is part of CommInfra’s operational expenses.
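
To make the export idea in note (1) more concrete, here is a minimal Python sketch of turning exported messages into a static HTML page. It is not one of the tools referred to above. It assumes each exported message is a JSON object with "type", "user", "ts", and "text" fields, which is how Slack exports are commonly laid out; the function name `messages_to_html` and the sample data are hypothetical, and real exports would also need user ID lookups, threads, and attachments handled.

```python
import html
import json
from datetime import datetime, timezone


def messages_to_html(messages, channel):
    """Render Slack-export-style message dicts as a simple HTML page.

    Assumption: each message has "type", "user", "ts" (epoch seconds as a
    string), and "text". Anything that is not a plain message is skipped.
    """
    rows = []
    # Sort by timestamp so the page reads in chronological order.
    for msg in sorted(messages, key=lambda m: float(m.get("ts", 0))):
        if msg.get("type") != "message":
            continue
        when = datetime.fromtimestamp(float(msg.get("ts", 0)), tz=timezone.utc)
        # Escape user-supplied content so it cannot inject markup.
        user = html.escape(msg.get("user", "unknown"))
        text = html.escape(msg.get("text", ""))
        rows.append(
            f"<li><b>{user}</b> <time>{when:%Y-%m-%d %H:%M} UTC</time>: {text}</li>"
        )
    title = html.escape(channel)
    body = "\n".join(rows)
    return f"<html><body><h1>#{title}</h1><ul>\n{body}\n</ul></body></html>"


# Hypothetical sample standing in for one day's export file of one channel.
sample = json.loads(
    '[{"type": "message", "user": "U456", "ts": "1716800100.000200", "text": "Second"},'
    ' {"type": "message", "user": "U123", "ts": "1716800000.000100", "text": "Hello <world>"}]'
)
page = messages_to_html(sample, "general")
```

Writing `page` to a file per channel would give a browsable archive that does not depend on any chat service staying available.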

lhawthorn commented 3 months ago

Speaking in my role as an oversight committee member:

I am +1 on migrating from Slack to Matrix with the note that we should export available project history and preserve for future learning.

I am +1 on not also choosing to implement Discord because yet another place to talk does not feel helpful to me at this time. I am concerned that if much AI chatter happens on Discord we may be losing some people but we can also choose to experiment with another platform later.

I am +1 on encouraging more use of the mailing lists but for now most of our user population seems very Slack focused. That's cool, we can see how that changes as we talk more about InstructLab at conferences and meetups.

And I am +1 on using our event presence to also find out where people who are interested in InstructLab also "hang out" because it helps us answer questions like Discord (y/n) or sister communities where we might want to learn more about each other's projects, etc.

And lastly thank you to @cybette for her work on this proposal.

If we are to move forward with the Slack to Matrix migration, we will need to arrange for support from the OSPO Community Infrastructure Team for hosting, set a timeline to announce the move and help people move over, accept some time period of having both services available to aid in the cutover, etc.

I think that goes without saying, but I have also learned over the course of time in communities that you really cannot overcommunicate. 😃

russellb commented 3 months ago

Go full-in with Matrix. It’s not free, but the cost will be around a tenth of paid Slack.

It's not explicitly called out, but does this come with a commitment to pay for it? What do we get with paying vs. free? Our own instructlab.ai space?