rust-community / team

the Rust Community Team 🦀⚙️✨
https://www.rust-lang.org/en-US/team.html#Community-team

Localization team discussion issue #178

Open sebasmagri opened 7 years ago

sebasmagri commented 7 years ago

During RustDay in Mexico City, erickt, brson and I discussed the benefits of having a dedicated localization team in the project.

The goals of such a team would include:

What else should we consider? What examples from other communities and open source projects can we use?

GuillaumeGomez commented 7 years ago

A long time ago, I opened a PR about adding localization for rustc itself. It was way too early, but it might be worth discussing again?

sebasmagri commented 7 years ago

@GuillaumeGomez Sure thing... the idea of this issue is to start gathering feedback and ideas to define goals for the l10n team. rustc is definitely one of the targets.

skade commented 7 years ago

I would also try to get a list of currently running translation efforts and get in touch with the people doing them.

carols10cents commented 7 years ago

Translation efforts for the book are listed in this appendix and they each have an issue labeled Translations.

carols10cents commented 7 years ago

Also this is the mdBook issue for multilingual support.

spastorino commented 7 years ago

I was talking with @sebasmagri and I think I'd focus on educational resources and the tooling ecosystem part for now.

dvigneshwer commented 7 years ago

We should also consider reaching out to the existing experienced Mozilla l10n community members for help, contributions, and guidance. They use a lot of cool tools like Transifex, etc.

sebasmagri commented 7 years ago

Thanks for the suggestion @vigneshwerd. We're definitely willing to get in touch with them for this initiative.

I would also like to gather some feedback from the Chinese community. cc @KiChjang @tennix, @wayslog, @3442853561, @zonyitoo

KiChjang commented 7 years ago

Also cc @KaiserY.

KiChjang commented 7 years ago

I believe the most up-to-date Chinese translation efforts are still in https://github.com/ctjhoa/rust-learning/blob/master/zh_CN.md. In particular, I think RustPrimer is being used by quite a lot of people in the Chinese community to learn Rust.

@KaiserY has just told me that they've done translation of the Rust book 2nd edition up to chapter 19. More details in his repo: https://github.com/KaiserY/trpl-zh-cn.

skade commented 7 years ago

@sebasmagri Can #124 and #125 be folded into this?

sebasmagri commented 7 years ago

@skade yep, I think those issues should be part of this. Let's fold them in, and then we can reopen issues with the requirements well defined in the localisation repo.

ariasuni commented 7 years ago

Two issues were created a long time ago concerning localization crates: https://github.com/rust-lang/rfcs/issues/822 and https://github.com/rust-lang/rust/issues/14495. I would definitely want to contribute to discussion and code about this.

sebasmagri commented 7 years ago

cc @hngnaig for Vietnamese efforts. 👍

3442853561 commented 7 years ago

I thought it would be nice to have a multi-lingual document annotation, although I didn't figure out how to do this

Manishearth commented 6 years ago

One thing worth mentioning is that stuff like i18n of the compiler is a really tricky business if we want to do it right. Languages are hard, and building systems that support all of them (e.g. supporting the 6 different kinds of pluralization Arabic has) is a tricky business. If we find a good i18n library in Rust we can use that, but it's likely we'll have to build our own.

We probably should focus on organizing the community to translate docs/etc (and on organizing the translated docs themselves; we already have a couple), and once this is bootstrapped we can look into i18ning the compiler.

(It seems like the above proposal is very much in line with this, just wanted to reiterate why we should do it that way)
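
To make the Arabic example concrete: CLDR defines six plural categories (zero, one, two, few, many, other), and Arabic uses all of them for cardinal numbers. Below is a minimal sketch of the whole-number rules only. The enum and function names are illustrative assumptions, not the API of any existing crate, and fractional numbers are not handled.

```rust
/// The six CLDR plural categories; Arabic uses all of them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PluralCategory {
    Zero,
    One,
    Two,
    Few,
    Many,
    Other,
}

/// Cardinal plural category for a whole number in Arabic, following the
/// CLDR rules for integers. (Hypothetical helper; fractions are ignored.)
pub fn arabic_plural_category(n: u64) -> PluralCategory {
    match (n, n % 100) {
        (0, _) => PluralCategory::Zero,
        (1, _) => PluralCategory::One,
        (2, _) => PluralCategory::Two,
        // Last two digits 03..10, e.g. 3, 10, 103.
        (_, 3..=10) => PluralCategory::Few,
        // Last two digits 11..99, e.g. 11, 99, 111.
        (_, 11..=99) => PluralCategory::Many,
        // Everything else, e.g. 100, 101, 102, 200.
        _ => PluralCategory::Other,
    }
}
```

A message catalog then needs up to six variants of every counted message for Arabic alone, and every other language has its own rule set, which is why a plain "singular or plural" switch in the compiler would not be enough.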

skade commented 6 years ago

FWIW, it might still make sense to try to get a good i18n project on the road :).


KiChjang commented 6 years ago

Indeed, I do see that there have been several discussions about i18n, but we've never moved beyond words, and I think the reason is pretty much that nobody knows how to kick-start an i18n project for the Rust compiler. Coupled with the fact that such a change requires an RFC, it makes the task more daunting.

Manishearth commented 6 years ago

I don't think working on an i18n library needs an RFC, it just needs a lot of work.


psychoslave commented 6 years ago

@sebasmagri invited me to join this discussion following a conversation on the internationalization of Rust itself.

In a nutshell, the proposal is to allow localized source code, identifiers and keywords included. English, or something like "EN-Rust", might still be used as the preferred default locale, while other locales would operate with the same level of integration, especially for debugging/profiling sessions. In addition, some tools might help with quick translexicalisation from one locale to another, for example when migrating, or when posting snippets to ask for help from another linguistic community.

sebasmagri commented 6 years ago

Hi! It's awesome to finally get more feedback here. So I'd like to do a quick review for those who are joining the conversation.

The i18n of rustc error messages is something that has been discussed many times. However, a final solution to this has not been identified. There is no centralized record of the different opinions on this front, though. It's all scattered across several issues and comments in RFCs and PRs.

OTOH, there are a bunch of initiatives in the ecosystem to provide good-quality, standards-based crates for ICU and l10n/i18n. However, there is little communication between them, and I think they could take advantage of having a common ground. This common ground will probably be provided by the compiler, since it's going to need it anyway for its internal l10n/i18n support, but it doesn't need to support all the features that fully fledged libraries support. I like to think of this working in a similar way to the log crate: providing base traits and a reference implementation used in rustc and std.
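
As a rough illustration of that "log crate" shape, here is a minimal sketch of a base trait plus a small reference implementation. Everything in it (the `MessageCatalog` trait, `InMemoryCatalog`, the `localize` function, and the message id used in the usage note) is hypothetical and only meant to show the split between a shared interface and interchangeable backends. It is not an existing rustc or std API.

```rust
use std::collections::HashMap;

/// Hypothetical base trait: anything that can look up a message by id.
pub trait MessageCatalog {
    /// Returns the localized text for `id` in `locale`, if one is known.
    fn lookup(&self, locale: &str, id: &str) -> Option<&str>;
}

/// Hypothetical reference implementation: a plain in-memory table.
#[derive(Default)]
pub struct InMemoryCatalog {
    messages: HashMap<(String, String), String>,
}

impl InMemoryCatalog {
    /// Registers `text` as the translation of message `id` for `locale`.
    pub fn insert(&mut self, locale: &str, id: &str, text: &str) {
        self.messages
            .insert((locale.to_owned(), id.to_owned()), text.to_owned());
    }
}

impl MessageCatalog for InMemoryCatalog {
    fn lookup(&self, locale: &str, id: &str) -> Option<&str> {
        self.messages
            .get(&(locale.to_owned(), id.to_owned()))
            .map(String::as_str)
    }
}

/// Callers depend only on the trait, so a fuller library could be swapped
/// in later. Falls back to the message id when no translation exists.
pub fn localize<'a>(catalog: &'a dyn MessageCatalog, locale: &str, id: &'a str) -> &'a str {
    catalog.lookup(locale, id).unwrap_or(id)
}
```

With a catalog holding a Spanish entry for a made-up id such as `"unused-variable"`, `localize(&catalog, "es", "unused-variable")` returns the Spanish text, while an unknown locale falls back to the id itself. A feature-complete crate could implement the same trait with plural handling and formatting on top.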

The other part of this is resources and documentation, namely The Rust Programming Language book, which means improving mdBook's support for i18n, and rustdoc support for multilingual docs. Efforts to translate TRPL, for example, have been tracked by @carols10cents, yet they need more coordination if we want to establish official docs translation as a goal.

So, I'd like to invite you all to provide feedback on an in-progress preRFC for a Localisation Team, which considers all the different fronts on which we'd need to work on l10n/i18n in the broader Rust project.

Thanks!

psychoslave commented 6 years ago

@sebasmagri I'm not sure my suggestion regarding enabling internationalisation and localisation of Rust itself would have its place in this preRFC. Would you be kind enough to confirm or deny whether I should add this topic to this conversation?

Apart from that, speaking about communication and common ground, I started a research project on Internationalisation of Programming Languages (https://en.wikiversity.org/wiki/Internationalisation_of_Programming_Languages). I will add a Rust section in the part about the state of the art. Everyone is welcome to join this wiki project to enrich it on Rust or any other programming language, as long as it stays on the topic treated in the research, of course.

Kind regards

Manishearth commented 6 years ago

Localizing language keywords itself does not belong here.


psychoslave commented 6 years ago

OK. If this topic does belong on any open issue, or if a new issue on it is welcome, just let me know. Otherwise I will stop adding further comments on this topic in this repository, and will only follow updates on the Discourse thread.

By the way, you might be interested in the GopherCon 2017 talk by Aditya Mukerjee, "Translating Go to Other (Human) Languages, and Back Again" (on YouTube).

I also discovered that Perl 6 slangs offer great flexibility in what can be parsed and fed to the underlying interpreter, all through native facilities, as far as I can judge from reading the docs.

feefladder commented 8 months ago

Heyy, sooo.... here's my 2 cents.

The link @psychoslave shared answers the question from the forum: translation would happen at the level of the lexer. I don't know exactly how strict cargo fmt is, but it could very well be possible to specify a locale (or verbosity level) there?

Furthermore, it'd be very nice for crate owners to ensure consistent naming, if they have to define naming themselves. Take this random sample of source code:

(GenericFraction::Rational(sb, vb), GenericFraction::Infinity(se)) => {
  if self.is_one() {
    Ok(GenericFraction::NaN)
  } else {
    match (vb < 1, se) {
      (true, Sign::Plus) => Ok(GenericFraction::zero()),
      (false, Sign::Minus) => Ok(GenericFraction::zero()),
      _ if sb.is_positive() => Ok(GenericFraction::Infinity(Sign::Plus)),
      _ => Ok(GenericFraction::NaN),
    }
  }
}

To the writer, it makes a lot of sense that vb means Value of the Base, since we are calculating a power here. However, for a person who has just come into the crate, it is difficult to understand what is going on. What could be done instead is to declare the naming scheme next to the code that uses it. For example, with this enum:


enum Fraction {
  Rational(Sign, Rational),
}

fn some_function_that_is_very_intricate_and_uses_values_a_lot_so_i_have_short_names(input: Fraction, output: Fraction) {
  // naming scheme: TYPE_VAR_SUPER_SHORT = { <type_name>[0]<var_name>[0] }
  match (input, output) {
    (Fraction::Rational(si, ri), Fraction::Rational(so, ro)) => {
      // perform calculations on si ri so ro
    }
  }
  // end TYPE_VAR_SUPER_SHORT
}

Then later, I can change my naming convention if I am confused about my own source code, with something like cargo fmt TYPE_VAR_SUPER_SHORT={<type_name>_<var_name>}, to get:


enum Fraction {
  Rational(Sign, Rational),
}

fn some_function_that_is_very_intricate_and_uses_values_a_lot_so_i_have_short_names(input: Fraction, output: Fraction) {
  // naming scheme: TYPE_VAR_SUPER_SHORT = { <type_name>_<var_name> }
  match (input, output) {
    (Fraction::Rational(sign_input, ratio_input), Fraction::Rational(sign_output, ratio_output)) => {
      // perform calculations on sign_input ratio_input sign_output ratio_output
    }
  }
  // end TYPE_VAR_SUPER_SHORT
}

which would also write to some file translexications.log:

TYPE_VAR_SUPER_SHORT = { <type_name>[0]<var_name>[0] } -> {<type_name>_<var_name>}

So that when I commit, this gets reverted. (or not and it will be a big mess, bc pre-commit hooks haven't been set up properly)

The important part here is scoping: I promise that all functions in this part (or maybe the file) will adhere to that naming scheme. blabla isomorphic translations blabla
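
A minimal sketch of how the two schemes above could be expanded into binding names. The `NamingScheme` enum and `binding_name` function are hypothetical, nothing like this exists in rustfmt or cargo, and lowercasing the full type name is an assumption of this sketch.

```rust
/// Hypothetical naming schemes matching the two examples above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NamingScheme {
    /// `<type_name>[0]<var_name>[0]`, e.g. `Sign` + `input` -> `si`
    Initials,
    /// `<type_name>_<var_name>`, e.g. `Sign` + `input` -> `sign_input`
    Full,
}

/// Builds a binding name from a type name and a variable name.
pub fn binding_name(scheme: NamingScheme, type_name: &str, var_name: &str) -> String {
    match scheme {
        // First character of each part, lowercased.
        NamingScheme::Initials => type_name
            .chars()
            .take(1)
            .chain(var_name.chars().take(1))
            .flat_map(char::to_lowercase)
            .collect(),
        // Both parts in full, lowercased and joined with an underscore.
        NamingScheme::Full => {
            format!("{}_{}", type_name.to_lowercase(), var_name.to_lowercase())
        }
    }
}
```

Note that the short scheme loses information: `si` alone cannot be expanded back to `sign_input` unless the tool still knows the type and variable names, which is one reason the scheme has to be declared per scope and recorded somewhere.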

Manishearth commented 8 months ago

I also think that is out of scope for this issue.

feefladder commented 8 months ago

Oh, sorry, I thought it could be a stepping stone toward localization of Rust source code, but I didn't explain it properly. The basic idea is: once there is a way to define how identifiers are named and formatted by cargo fmt, it should be possible for my Dutch colleague to translate identifiers to Dutch, thus allowing for multilingual crates. E.g.:

// T = "Type"[0]
// name = { "My" + "Structure"[0..6] }
impl<T> MyStruct<T> {
  // some macro doing:
  // function name = {"power"[0..3] + "integer"[0]} -> "powi"
  // arguments = ["base", "exponent"]
  fn powi(base: T, exponent: T) -> T { todo!() }
}

then, my Dutch colleague could do cargo fmt --locale=NL to get:

impl<T> MijnStruct<T> {
  // functienaam = {"macht"[0..3] + "heel_getal"[0]} -> "mach"
  // argumenten = ["grondtal", "exponent"]
  fn mach(grondtal: T, exponent: T) -> T { todo!() }
}

which would get translexicalized back to English on commit. It's kind of what https://github.com/ChimeraCoder/koro does, but also has huge benefits for consistent naming in English-only environments. The actual translation of identifiers could be offloaded to fluent-rs or something. I'll open an issue for this at fmt, that's maybe a better place?
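
The round trip only works if the identifier mapping is one-to-one. Here is a minimal sketch of that constraint, assuming a hypothetical `Lexicon` type that rewrites whole identifiers and refuses ambiguous entries. It is not how koro, fluent-rs, or rustfmt work, and it ignores strings, comments, and keywords.

```rust
use std::collections::HashMap;

/// Hypothetical bidirectional lexicon between English and one other locale.
pub struct Lexicon {
    to_local: HashMap<String, String>,
    to_english: HashMap<String, String>,
}

impl Lexicon {
    /// Builds the lexicon, rejecting pairs that would make the mapping
    /// ambiguous in either direction (and therefore not reversible).
    pub fn new(pairs: &[(&str, &str)]) -> Option<Self> {
        let mut to_local = HashMap::new();
        let mut to_english = HashMap::new();
        for (english, local) in pairs {
            if to_local.insert(english.to_string(), local.to_string()).is_some()
                || to_english.insert(local.to_string(), english.to_string()).is_some()
            {
                return None;
            }
        }
        Some(Self { to_local, to_english })
    }

    /// English identifiers -> localized identifiers.
    pub fn localize(&self, source: &str) -> String {
        rewrite_identifiers(source, &self.to_local)
    }

    /// Localized identifiers -> English identifiers.
    pub fn to_english(&self, source: &str) -> String {
        rewrite_identifiers(source, &self.to_english)
    }
}

/// Replaces whole identifiers found in `table`; everything else is copied.
fn rewrite_identifiers(source: &str, table: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(source.len());
    let mut word = String::new();
    for ch in source.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            word.push(ch);
        } else {
            flush(&mut word, &mut out, table);
            out.push(ch);
        }
    }
    flush(&mut word, &mut out, table);
    out
}

/// Appends the pending identifier (translated if known) and resets it.
fn flush(word: &mut String, out: &mut String, table: &HashMap<String, String>) {
    if !word.is_empty() {
        out.push_str(table.get(word.as_str()).map(String::as_str).unwrap_or(word.as_str()));
        word.clear();
    }
}
```

Even with a one-to-one table, the round trip can still break if the English source already contains an identifier that is spelled like one of the localized names, so a real tool would need scoping rules on top of this.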

Manishearth commented 8 months ago

@feefladder Again, this issue is not about localizing source code identifiers and comments. This is for localizing documentation and other resources.

Localizing source code is a worthwhile endeavor, but it is out of scope for this issue.