Tatoeba / tatoeba2

Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations.
https://tatoeba.org
GNU Affero General Public License v3.0

The Wall is not appropriate any more for the community to interact #2519

Open jiru opened 4 years ago

jiru commented 4 years ago

Story

(See also the thread on the Wall).

The Wall was born in late 2009. I wasn’t there at that time, but Tatoeba was certainly quite different. To give some subjective historical context, I’d like to quote a message from one of our early members:

> I joined Tatoeba back in 2010 and I've contributed a bit to the Arabic corpus. Things were very different then. It was packed with students from all over the world who really were enthusiastic about free and open source software and data, enthusiastic about other cultures and languages, and enthusiastic about the possibilities that this project represented and stood for. Excitement was in the air, we'd have fun discussions in the comments or on IRC that end up as sentences, we'd create series of them and people would get busy translating those pieces of works into all the languages we knew, we paired up and added sentences for each other, we'd even have a "tatoeba day" every once in a while where we'd improve the entire corpus quite a bit and set new stat records, we had weekly updates, [...]

To put his message in perspective, I compiled some statistics about the growth of the community and the Wall:

| Date | Members having contributed[1] | 90% of contributions made by | Total Wall posters[2] | Total Wall posts and replies | Total Wall threads |
|---|---|---|---|---|---|
| 2010-01-01 | 176 | 25 (14.2%) | 19 (10.7%) | 95 | 26 |
| 2010-03-10[3] | 269 | 32 (11.9%) | 36 (13.3%) | 294 | 75 |
| 2011-01-01 | 1512 | 143 (9.5%) | 191 (12.6%) | 4140 | 849 |
| 2011-03-12[4] | 1772 | 167 (9.4%) | 226 (12.7%) | 4822 | 1001 |
| 2012-01-01 | 3178 | 240 (7.6%) | 377 (11.8%) | 8037 | 1565 |
| 2013-01-01 | 4902 | 271 (5.5%) | 621 (12.6%) | 12036 | 2308 |
| 2014-01-01 | 6576 | 295 (4.5%) | 819 (12.4%) | 14769 | 2825 |
| 2015-01-01 | 8060 | 310 (3.8%) | 994 (12.3%) | 17538 | 3429 |
| 2016-01-01 | 9807 | 313 (3.2%) | 1149 (11.7%) | 20867 | 4037 |
| 2017-01-01 | 11104 | 322 (2.9%) | 1236 (11.1%) | 23005 | 4500 |
| 2018-01-01 | 11935 | 321 (2.7%) | 1290 (10.8%) | 23826 | 4743 |
| 2019-01-01 | 12918 | 318 (2.5%) | 1368 (10.5%) | 25818 | 5172 |
| 2020-01-01 | 13735 | 331 (2.4%) | 1441 (10.4%) | 28261 | 5622 |
| 2020-08-24 | 14444 | 339 (2.3%) | 1496 (10.3%) | 30007 | 5949 |

[1] Members who had contributed at least one sentence or translation.
[2] Members who had posted at least one Wall post or reply (percentage is relative to [1]).
[3] Date the above-quoted member joined Tatoeba.
[4] Date the "All languages equal" banner appeared.

From this table, we can say that:

So when this member joined Tatoeba in 2010, I believe the Wall was well sized and well suited for the community to interact. It was a kind of cosy space where everyone knew each other. However, the structure of the Wall hasn’t changed at all since then. Given the growth of the community, I don’t think it is appropriate any more. If the Wall has somehow kept this sense of cosiness, it is at the expense of excluding a lot of members by design.

So I identified a few shortcomings with the Wall:

Idea

Let’s discuss to find ways around these limitations. From there, we can create new tickets to describe and implement solutions.

AndiPersti commented 4 years ago

> Let’s discuss to find ways around these limitations. From there, we can create new tickets to describe and implement solutions.

Well, in 2014 you sounded a little bit different. :smile:

I fully agree with your younger self and would prefer using a plugin/external forum. There has already been enough NIH in the past.

mramosch commented 4 years ago

I was reading the UX 1-4 tests and was wondering whether there are more available somewhere, because this is some of the most interesting stuff I have ever found in Tatoeba discussions...

jiru commented 4 years ago

@mramosch Unfortunately, no. UX tests 1 to 4 were done by myself. Trang and I have invited people to do more of them on the Wall but so far we haven’t heard from anybody. You are welcome to do more UX tests! :wink:

ckjpn commented 4 years ago

A traditional web-based forum would likely accomplish a lot of what you mention.

https://github.com/Tatoeba/tatoeba2/issues/367 Suggested in 2014.

LBeaudoux commented 4 years ago

Discourse and Flarum seem to be the top choices for adding forum features to an app.

jiru commented 4 years ago

Flarum is not ready for production yet.

Discourse is more mature and has lots of plugins, but it’s written in Ruby on Rails, which means a lot of new things to learn and to deal with if we want any kind of customization. Discourse also requires an outgoing MTA, which we don’t have yet (although it would be a good thing anyway).

As for CakePHP plugins, the options are pretty limited. Even the CakePHP people are using Discourse. The best thing I could find is the one from CakeDC. Not particularly fancy, but it does the job and looks maintained.

LBeaudoux commented 4 years ago

I took a closer look at CakeDC and Discourse.

I have the impression that CakeDC does not offer multilingualism and search as requested by Thanuir and Guybrush88 on the Wall. It seems however that Discourse allows search and is even SEO optimized. Moreover, its UI already supports many languages and it is possible to add a Google Translate plugin to easily translate posts written in languages that a user doesn't understand. More generally, Discourse has many more features, more integration capabilities with other apps (e.g. GitHub) and a much more vibrant community than CakeDC. For a 6-year-old open source project, Discourse is already very mature and I feel that it is becoming the new standard.

From a more technical point of view, it seems to me from this video tutorial that it is not necessary to dig into the code to customize the application since the settings are configured on an administrator panel. The most delicate point is probably the implementation of Single-Sign-On. As jiru has already mentioned, it is also necessary to set up an MTA that could also be useful for other features.

With Discourse, the short-term cost is quite high, but I'm afraid that if we choose CakeDC, the vitality of the forum will be disappointing and the maintenance burden will be heavier than expected. The worst case scenario would be to have to later migrate the forum data from CakeDC to Discourse or any other platform.

jiru commented 4 years ago

@LBeaudoux You have a point. I don’t know if we can afford the long-term maintenance of a custom forum implementation, even one based on the CakeDC plugin.

Having a proper MTA may also help solve #2303.

Discourse’s SSO page says that we need to validate emails before sending them to Discourse (the usual "click on the registration link from the email you received"). We currently do not validate emails so this would be yet another thing to implement (but it would be good to do so anyway).
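To make the email validation point concrete, here is a minimal sketch of the provider side of Discourse's SSO exchange, written in Python for brevity rather than in Tatoeba's PHP. It assumes the protocol as documented by Discourse: a base64 payload signed with HMAC-SHA256 using a shared secret, and a `require_activation` field that asks Discourse to confirm the address itself when the site has not validated it. The `user` dictionary and its keys (`id`, `email`, `username`, `email_verified`) are hypothetical and do not reflect Tatoeba's actual schema.

```python
import base64
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode


def sign(payload_b64: str, secret: str) -> str:
    """Hex HMAC-SHA256 of the base64 payload, keyed with the shared SSO secret."""
    return hmac.new(secret.encode(), payload_b64.encode(), hashlib.sha256).hexdigest()


def build_sso_response(sso: str, sig: str, secret: str, user: dict) -> str:
    """Check the incoming request and return the query string to send back."""
    # Reject requests that were not signed with the shared secret.
    if not hmac.compare_digest(sign(sso, secret), sig):
        raise ValueError("invalid SSO signature")

    # The decoded payload is itself a query string that carries the nonce.
    request = parse_qs(base64.b64decode(sso).decode())

    reply = {
        "nonce": request["nonce"][0],
        "external_id": str(user["id"]),  # hypothetical user fields
        "email": user["email"],
        "username": user["username"],
    }
    # Assumption: emails are not validated on our side, so let Discourse do it.
    if not user["email_verified"]:
        reply["require_activation"] = "true"

    reply_b64 = base64.b64encode(urlencode(reply).encode()).decode()
    return urlencode({"sso": reply_b64, "sig": sign(reply_b64, secret)})
```

If this reading of the protocol is right, sending `require_activation` could serve as a stopgap until email validation is implemented on the Tatoeba side.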

Thanuir mentioned Vanilla, which also looks promising. It is PHP/MySQL based and looks quite active, although commits suddenly stopped about 2 months ago and I’m not sure what’s happening to them. They also recommend validating email addresses for SSO.

alanfgh commented 4 years ago

There's a thread on the Wall that links to this discussion, but there is no link in the opposite direction, so I'm adding one.

In that discussion, @Guybrush88 says that the most important thing to him is a proper search for the Wall ( #2160 ). I agree.

My question is what role the Wall would play if you added a separate forum. Would you get rid of it? If you did, how would people convey general-interest messages (for instance, "Feature X is broken" or "Feature Y has been added") to members of the community? (The banner comes to mind as a place for displaying some such messages, but they can only be posted by a very small number of admins, must be brief, and cannot include responses.) If you didn't get rid of the Wall, how would you handle the fact that people would want to continue to post on it? Would we have to do what moderators continually have to do on Stack Exchange, namely move lengthy threads to another place on the site while nagging people to hold their ongoing discussions there in the first place?

jiru commented 4 years ago

@alanfgh Thanks. I updated my top post to include the link too. It is worth mentioning that CakeDC doesn’t allow searching while the other solutions we mentioned do.

> My question is what role the Wall would play if you added a separate forum. Would you get rid of it?

I don’t think we want to keep both a new forum and the Wall active at the same time because of the reasons you mentioned. We could close the Wall and import its contents into a category of the new forum. But since Wall threads don’t have titles, it would be hard to browse: the best title we can come up with for all Wall threads is probably "Thread 1234".

> If you did, how would people convey general-interest messages (for instance, "Feature X is broken" or "Feature Y has been added") to members of the community?

I’m not sure I understand your question. I think the point of having a forum is not to have unclassified posts any more. "Feature X is broken" could go under the category "Bug reports" while "Feature Y has been added" could go under the category "Announcements". These are not general-interest because not everybody is interested in reading that kind of content. Only developers and bug reporters want to know about bug reports. Only highly-involved members want to know about new features. In contrast, with a well-organized forum, group communication is more efficiently directed between people having interest in a particular topic. We can handle much more diverse discussions while keeping a low noise ratio for everybody.

LBeaudoux commented 4 years ago

> Thanuir mentioned Vanilla which also looks promising

Vanilla seems to primarily address corporate customers who typically integrate forums into WordPress-based branding websites. Many key add-ons are not included in their Community Edition and are only available in their hosting plan.

Discourse seems more involved in FOSS and more in line with the needs of less commercial communities like Tatoeba. Not surprising knowing that one of its founders previously founded Stack Overflow.

Informative "Discourse vs Vanilla" comparisons are available on Slant and Similartech.

alanfgh commented 4 years ago

> Only developers and bug reporters want to know about bug reports.

That's true for small bugs that only affect a small number of people. However, there can be disruptive bugs that most people will want to know about (though fortunately these have been pretty rare).

> Only highly-involved members want to know about new features.

It depends on what you mean by "highly-involved". If, say, the vocabulary request feature were vastly improved, then users who only use Tatoeba's search capabilities wouldn't care. But anybody who adds sentences, even casually, might want to know about it.

> In contrast, with a well-organized forum, group communication is more efficiently directed between people having interest in a particular topic.

It's important to recognize the tradeoffs. Yes, there's efficiency to be gained when the subject is not of interest to everyone, but there's also something lost when discussions don't reach the people they might have because they're not aimed at everyone.

jiru commented 4 years ago

@alanfgh Thanks for bringing relevant non-technical points into the discussion. I think it’s very important to think this through, not only to solve the problems I mentioned but also to get support from existing Wall users. I definitely don’t want to just throw a Discourse instance and say "here you go!".

> It's important to recognize the tradeoffs. Yes, there's efficiency to be gained when the subject is not of interest to everyone, but there's also something lost when discussions don't reach the people they might have because they're not aimed at everyone.

I can see how the Wall makes it easy to go through all the ongoing discussions and read something unexpected yet interesting/useful. But I think this comes with an even more serious tradeoff: veteran members, blatant posters, tech-savvy users, English speakers, or people who otherwise feel confident enough to enter the discussion space do take up all the space, leaving little to none for others.

I’m not really in favour of messages aimed at "everyone". Which messages, if any, should be aimed at everyone? This is highly subjective if you let the sender decide. Let’s say I am new and I want to ask a question or make a comment about Tatoeba. Is my message so important that it ought to reach everyone out there? "Who am I to trouble this well-established community with my stupid message?" the one who feels unwelcome* may think. "I have such a great idea everyone should know about it." the overconfident one may think. As a result you mostly get messages of the latter kind. It really is about being inclusive to foster diversity among our members. And I think the structure of a discussion space can help (or worsen) inclusiveness.

By the way, real walls (on which people freely put their ads) suffer from a similar problem. Such walls are full of people in need of attention.

This reminds me of the Berber/Kabyle flag flamewar where Trang stopped the discussion by saying that "[they] had enough attention for the past weeks". I think the current structure of the Wall worsened the flamewar because it is a space where members can display their opinions to everyone, the same way one would stick a political poster on a real wall.

[*] I mean somebody who feels unwelcome for any reason, such as being inexperienced, too different, too uneducated, not speaking English, etc.

jiru commented 4 years ago

On the technical side, maybe we could have a special page of the forum that would display the latest threads among all categories in a similar fashion to the Wall. It would just be a special "read-only" page to browse all the threads at once.

SeaLiteral commented 2 years ago

> But since Wall threads don’t have titles, it would be hard to browse: the best title we can come up with for all Wall threads is probably "Thread 1234".

Maybe include the first 100 or so characters of the first post in the thread, replacing newlines with spaces? I also considered stopping at the first newline, but that wouldn't work well if someone wrote "Hello!" and then a newline and then everything else.
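A minimal sketch of this suggestion, in Python for illustration only: it takes the first 100 or so characters of the first post, with newlines replaced by spaces, and falls back to the "Thread 1234" style when the post is empty. The function name, its parameters, and the trailing "..." marker are assumptions and are not part of the Tatoeba codebase.

```python
def wall_thread_title(thread_id: int, first_post: str, max_chars: int = 100) -> str:
    """Derive a forum title from the first post of a Wall thread."""
    # Collapse newlines (and any other run of whitespace) into single spaces,
    # so "Hello!" followed by a newline still leads into the rest of the post.
    flat = " ".join(first_post.split())

    # Nothing usable in the post: fall back to the numeric title.
    if not flat:
        return f"Thread {thread_id}"

    if len(flat) <= max_chars:
        return flat

    # Truncate and mark that the title was cut.
    return flat[:max_chars].rstrip() + "..."
```

Collapsing every run of whitespace goes slightly beyond replacing newlines, but it avoids double spaces in titles when a post contains blank lines.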

ckjpn commented 2 years ago

Related: https://github.com/Tatoeba/tatoeba2/issues/367