Bringing tiny general docs fixes to all localisations in umbrella PRs

shurup commented 1 year ago

This is a Feature Request

What would you like to be added

Sometimes we make tiny corrections to the original (English) docs, such as fixing an outdated link. The current workflow is to merge them there, and the job seems to be done. Only later, and rather randomly, do such updates reach other languages one by one… and they probably never land everywhere.

E.g.:

This is how we end up with:
- 38534,
- 38543,
- 38585,
- 38594.
The case in #38862 is similar but a bit trickier. It's not just a link; it also requires dates written in the relevant language, so this approach should not cover changes of that kind.

To me, it sounds more reasonable to a) get the change approved for the English version first and b) immediately after it's approved, bring it to all other localisations at once. We can do it in two PRs (one for English and another for all other languages together) or maybe even in one, so that each change is presented in a single place.

Why is this needed

This approach will help us:

- Reduce the number of small, identical PRs and the number of people involved in creating/reviewing them.
- Make such documentation fixes more consistent across all available localisations.

Comments

The obvious drawback is that a localisation team might miss such changes landing in its localisation. However, if the change is minor and doesn't affect the language itself (just a URL or some specific English term), it doesn't sound like an issue to me. Still, a simple way to notify all localisation teams' owners in such "umbrella PRs" might be helpful.

k8s-ci-robot commented 1 year ago

@shurup: This issue is currently awaiting triage.

SIG Docs takes a lead on issue triage for this website, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

sftim commented 1 year ago

You'll need to get every localization team to agree to change their workflow to match what you're proposing here, @shurup. If we can do that, then this change is easy to make.

a-mccarthy commented 1 year ago

@shurup, exactly what @sftim said. To my mind, the main issue here is visibility and reviewing. Localizations are set up so that only approved language reviewers can review/approve content and merge PRs. This makes sure that folks within the localization teams know about the changes and can raise any concerns before the PR is merged into the docs.

In the case of an umbrella PR like the one you are describing, I think we need to figure out:

- Who is the reviewer/approver on the PR? If it's multi-language, which language gets assigned to review by the bot?
- Does the PR bot allow a reviewer of one language to review/approve PRs for different languages?

I think this is worth investigating further to get feedback from other localizations. @shurup, are you able to make it to the next localization meeting on February 6th to talk more about this topic? We can also chat more async in this issue (and Slack) :)

/area localization

tengqm commented 1 year ago

In the case of the Chinese localization, we treat each page as the minimum unit for synchronization from en to zh-cn. Every time we update a page, we require a full resync of the whole page. We check whether a page is out of sync by comparing the date of the latest commit. The detection is performed using the scripts/lsync.sh tool. When a minor change A is merged to an English page content/en/docs/foo/bar.md, we need to check ALL the changes made to the English page since content/zh-cn/docs/foo/bar.md was last updated. In many cases, other changes B and C could have been merged to the English version before A. The next time a PR is proposed to content/zh-cn/docs/foo/bar.md, we need to make sure that the PR has incorporated A, B, and C. We will reject "partial update PRs", i.e. PRs that only synchronize change A.
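
The commit-date comparison described above can be sketched roughly as follows. This is not how scripts/lsync.sh is implemented; it is a minimal Python illustration, and the helper names (`last_commit_timestamp`, `localized_path`, `stale_pages`) are hypothetical. It assumes it is run from the root of a git checkout that uses the content/en and content/zh-cn layout mentioned above.

```python
import subprocess


def last_commit_timestamp(path: str) -> int:
    """Unix timestamp of the latest commit touching `path` (0 if there is none)."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return int(out) if out else 0


def localized_path(en_path: str, lang: str) -> str:
    """Map content/en/... to content/<lang>/... (layout assumed from the thread)."""
    prefix = "content/en/"
    if not en_path.startswith(prefix):
        raise ValueError(f"not an English page: {en_path}")
    return f"content/{lang}/" + en_path[len(prefix):]


def is_out_of_sync(en_ts: int, l10n_ts: int) -> bool:
    """A localized page counts as stale if the English page has a newer commit."""
    return en_ts > l10n_ts


def stale_pages(en_pages, lang, timestamp=last_commit_timestamp):
    """Return the English pages whose localized counterpart is older.

    A localized page with no commits gets timestamp 0, so it is reported too.
    """
    return [
        page for page in en_pages
        if is_out_of_sync(timestamp(page), timestamp(localized_path(page, lang)))
    ]
```

A page flagged this way would then need a full resync covering every English change since the localized page was last updated, not just the most recent one.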

divya-mohan0209 commented 1 year ago

I agree with all the valid points Abbie, Qiming, and Tim made. Adding my two cents below.

- As a co-chair: I'd love for this to happen to reduce the number of incoming PRs. The only challenge is gaining consensus from all the localization leads.
- As a localization approver/reviewer: The review/approve cadence, processes, and content consistency differ across localizations. In the case of trivial umbrella PRs, one of the manual overheads I foresee is the content consistency part. Every time we open such a PR, we will need to coordinate and figure out the localizations in which the file is present. If it is not present, we can, of course, leave it as-is. But if we explore the route of creating the file and updating the associated content to improve consistency across localizations, we will need to discuss this in further detail. Additionally, as @a-mccarthy correctly pointed out, we will need approval after all the localization leads sign off. Discussing who would be the POCs to do that might be of merit too.
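
Figuring out which localizations have a given file could be partly automated. Below is a minimal sketch, assuming the content/<lang>/... directory layout seen elsewhere in this thread; the function name is hypothetical and is not an existing tool in the repo.

```python
from pathlib import Path


def localizations_with_page(repo_root, en_relpath):
    """List language codes under content/ that have their own copy of an English page.

    `en_relpath` is a repo-relative path such as content/en/docs/foo/bar.md.
    """
    content = Path(repo_root) / "content"
    # Path of the page relative to its language directory, e.g. docs/foo/bar.md
    rel = Path(en_relpath).relative_to("content/en")
    return sorted(
        d.name for d in content.iterdir()
        if d.is_dir() and d.name != "en" and (d / rel).is_file()
    )
```

Localizations missing from the returned list do not have the page yet, which is the case where the file would be left as-is.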

I recommend we bring this up in the next localization meeting and discuss the specifics. If all localization leads are in agreement, we could move this ahead.

a-mccarthy commented 1 year ago

We chatted about this a bit in our last localization meeting. Two concerns were raised:

- Some localization teams prefer to resolve these kinds of issues within their localization using the same processes they use for all other changes.
- These kinds of PRs could not include any wording changes. If a PR contains any translated words, it's best to keep the current model of a separate PR for each localization, which helps make sure that translations are valid.
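
The "no word changes" constraint could in principle be checked mechanically. The following is a rough sketch under a simplifying assumption: only absolute http(s) URLs count as language-neutral, so relative links and English terms are not handled. The function name and the regex are illustrative and not an existing check in the repo.

```python
import re

# Assumption: a URL runs until whitespace or a closing ), > or ] character.
URL_RE = re.compile(r"https?://[^\s)>\]]+")


def only_urls_changed(old_line: str, new_line: str) -> bool:
    """True if the lines differ but become identical once URLs are masked."""
    if old_line == new_line:
        return False

    def mask(text: str) -> str:
        return URL_RE.sub("<URL>", text)

    return mask(old_line) == mask(new_line)
```

A change that fails this check touches wording and would stay within each localization's own review process.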

Based on the majority of the feedback we have gotten on this issue, I'm thinking that this is not something we will be adopting because it introduces too many non-trivial changes to established processes for what we would be gaining.

I'm happy to continue chatting about this if folks have different thoughts :)

shurup commented 1 year ago

First of all, I am incredibly sorry I brought this issue to wide attention and immediately disappeared :sweat_smile:

Many thanks to everyone for your input; it was super helpful! Based on it, I can see this idea doesn't seem to be reasonable enough. So I'm happy to close this issue with a better understanding of why we are where we are. Thank you for summing it all up, @a-mccarthy!

I also very much like the idea of full resyncs used by the Chinese team. Such practices are definitely worth sharing with other teams. It's quite off the original topic, but is the localisation experience of different teams documented anywhere? I know some issues are discussed during the meetings and in Slack, but having some kind of "localisation best practices knowledge base" would be amazing. I think it would be overkill for the existing https://kubernetes.io/docs/contribute/localization/ page since it focuses on more general things. Maybe we can start by creating one more "child" page to accommodate the approaches and techniques different localisation teams use to deal with different tasks? Not as a guideline, but as existing experience you might consider trying in your team. Or maybe there is a better way to do it, and we'd better discuss it elsewhere.

tengqm commented 1 year ago

There is a sig-docs-localizations Slack channel, and there is a community meeting for localization teams to gather and share experiences. However, some of the practices are not applicable to all teams because they are closely tied to how each team tracks changes. For smaller/inactive teams, there could be a dedicated branch for batch translation/syncs. For larger/active teams, the localization may even be tracking the latest changes to the English content. For this and other reasons, each team may have its own protocols for managing terminology, content style, etc. For example, the scripts/lsync.sh tool is heavily used by the Chinese localization team because that team closely tracks the latest changes, based on the timestamp of the last commit. The tool can handle different languages, but that doesn't mean it should be recommended to teams that are working on an older release that is behind the current one.

shurup commented 1 year ago

Thanks, @tengqm! I am aware of Slack and meetings. My idea was to create very basic documentation giving a brief introduction to those practices as a bit more formal way to share the experience. It could have become a starting point for those who are looking for ways to establish or improve their own workflows. Not even as a recommendation but just as an option to consider — it can be adopted/modified or not used at all, of course.

a-mccarthy commented 1 year ago

@shurup Thanks for raising this issue! I think we've had a lot of valuable discussions around this topic, and it brought up some great points about different localizations' processes.

I agree that sharing the ideas and processes from different teams can be very useful, and it's one of the main reasons why the localization subproject was formed :) I'm not sure that the main Kubernetes website is the best place for them, though, because there are a lot of variables to consider and reasons why a localization has adopted certain practices (as @tengqm pointed out). The website's contribute section is usually where we describe set or prescribed processes that everyone should follow.

These might be better as blog posts, where a localization can share its experiences and practices, or in some other place that covers SIG practices, like the kubernetes/community repo: https://github.com/kubernetes/community/tree/master/sig-docs.

In terms of this issue, I think we are OK to close it. We can continue to chat about processes in Slack and in the meetings, and figure out the best way and place to collect these ideas as well.

shurup commented 1 year ago

Thanks, Abigail! I'm closing the issue.

The kubernetes/community repo seems to be the right place for what I'm suggesting. Will investigate it more thoroughly a bit later.

sftim commented 1 year ago

If a change is very high value to the Kubernetes project, I hope we'd consider making an exception to our usual policy and consider having a PR that touches multiple localizations. This should not be the norm; rather, it'd be something we save for rare and important cases such as a high-impact security notice.

tengqm commented 1 year ago

> If a change is very high value to the Kubernetes project, I hope we'd consider making an exception to our usual policy and consider having a PR that touches multiple localizations. This should not be the norm; rather, it'd be something we save for rare and important cases such as a high-impact security notice.

Good point.

MaxymVlasov commented 1 year ago

Okay, the lang-uk team is fine with that. Better to fix git conflicts than to have outdated docs.

kubernetes / website

Bringing tiny general docs fixes to all localisations in umbrella PRs #38965
