huggingface / audio-transformers-course

The Hugging Face Course on Transformers for Audio
Apache License 2.0

Translation to Russian #109

Open blademoon opened 1 year ago

blademoon commented 1 year ago

Hi there 👋

Let's translate the course to Russian so that the whole community can benefit from this resource 🌎!

Below are the chapters and files that need translating - let us know here if you'd like to translate any and we'll add your name to the list. Once you're finished, open a pull request and tag this issue by including #issue-number in the description, where issue-number is the number of this issue.

🙋 If you'd like others to help you with the translation, you can also post in our forums or tag @_lewtun on Twitter to gain some visibility.

Chapters

UNIT 0. WELCOME TO THE COURSE!

UNIT 1. WORKING WITH AUDIO DATA

UNIT 2. A GENTLE INTRODUCTION TO AUDIO APPLICATIONS

UNIT 3. TRANSFORMER ARCHITECTURES FOR AUDIO

UNIT 4. BUILD A MUSIC GENRE CLASSIFIER

UNIT 5. AUTOMATIC SPEECH RECOGNITION

UNIT 6. FROM TEXT TO SPEECH

UNIT 7. PUTTING IT ALL TOGETHER

UNIT 8. FINISH LINE

Course Events

Adding extra material:

blademoon commented 1 year ago

Good evening. I have added a file, translation_agreements.txt, where I record words and phrases that need to be translated consistently. This keeps the translation style uniform if several translators work on the course.

blademoon commented 1 year ago

@lewtun Hello. Could you do a review? As soon as I finish translating Unit 1, I plan to push everything translated so far to the official repository.

Lightmourne commented 1 year ago

Hi! I can take chapter 2 for translation.

blademoon commented 1 year ago

@Lightmourne Good news! The two of us can do more!

blademoon commented 1 year ago

@MKhalusova Good afternoon, Maria. Can you suggest who can do a review of our contribution to the repository?

MKhalusova commented 1 year ago

@blademoon Thanks for initiating and organizing this effort! I can handle the reviews.

blademoon commented 1 year ago

@MKhalusova That's great!

blademoon commented 1 year ago

@MKhalusova I have one question: will the course include an example of how to train Whisper for multilingual ASR? For example, how to fine-tune Whisper for two languages, English and Russian? That would be very useful.

MKhalusova commented 1 year ago

@blademoon Whisper is already multilingual, and you can further fine-tune on any language. In the course, we show how to fine-tune it on a language it wasn't trained on - Dhivehi. But you can apply the same principles to other languages.
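
As a rough illustration of why the same principles carry over to a mix of languages: Whisper uses one shared tokenizer, and the language is chosen by a special token in the decoder prompt, not by a separate tokenizer per language. The sketch below only builds that prompt prefix as plain strings. The helper name `whisper_prefix`, the `SUPPORTED` set and the sample data are hypothetical; in practice the Hugging Face Whisper tokenizer takes `language` and `task` settings and adds these tokens for you.

```python
# Sketch only: shows how a Whisper-style decoder prompt selects the language.
# `whisper_prefix`, `SUPPORTED` and `examples` are hypothetical names for illustration.

SUPPORTED = {"en", "ru"}  # languages in this hypothetical fine-tuning mix


def whisper_prefix(language: str, task: str = "transcribe", timestamps: bool = False) -> list:
    """Return the special tokens that precede the transcript for one example."""
    if language not in SUPPORTED:
        raise ValueError(f"unexpected language code: {language}")
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unexpected task: {task}")
    prefix = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        prefix.append("<|notimestamps|>")
    return prefix


# A mixed-language batch: the same vocabulary is used for both examples,
# only the language token in the prefix differs.
examples = [
    {"language": "en", "text": "Hello world"},
    {"language": "ru", "text": "Привет, мир"},
]

labels = [
    "".join(whisper_prefix(ex["language"])) + ex["text"] + "<|endoftext|>"
    for ex in examples
]

for label in labels:
    print(label)
```

Under these assumptions, fine-tuning on two languages at once comes down to giving each training example the prefix for its own language.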

blademoon commented 1 year ago

@MKhalusova Good afternoon. Yes, I have already looked at the Whisper fine-tuning notebook. I thought there might be some difference when fine-tuning Whisper on two languages at the same time. After all, each language would at least need its own tokenizer, configured accordingly. But so far I don't understand how to do that.

Lightmourne commented 1 year ago

Hi. Chapter 4 [BUILD A MUSIC GENRE CLASSIFIER] has been translated into Russian.

Lightmourne commented 1 year ago

@blademoon I'll take chapter 5 [AUTOMATIC SPEECH RECOGNITION] for translation.

blademoon commented 1 year ago

@Lightmourne OK 😉

blademoon commented 1 year ago

@MKhalusova Good afternoon, Maria. The new part of the translation has been sent to PR https://github.com/huggingface/audio-transformers-course/pull/122

Sergey and I have also agreed that, once the translation is finished, we will reread the whole course (working through it at the same time to check everything) and make minor edits. This should improve the quality of our work.

blademoon commented 1 year ago

@Lightmourne @MKhalusova Good afternoon. I decided to add a new marker to our task list: MINOR_FIX_DONE. It will mark the files that we have reread and in which we have corrected minor errors.

blademoon commented 1 year ago

@MKhalusova Good evening, Maria. Can you tell me what a "rainbow passage" is? Is it a book, or something else?

MKhalusova commented 1 year ago

@blademoon The rainbow passage is a specific piece of text that is often used in English-language speech and voice research to assess different aspects of speech. It includes a variety of phonetic sounds and linguistic patterns that can help researchers understand how speech sounds are produced by individuals with different accents or speech characteristics.

blademoon commented 1 year ago

@MKhalusova Thank you, I will add your clarification to the translated version. It will seriously simplify understanding.

blademoon commented 1 year ago

@MKhalusova In file "pre-trained_model.mdx" (Chapter 6):

## SpeechT5 

[SpeechT5](https://arxiv.org/abs/2110.07205) is a model published by Junyi Ao et al. from Microsoft that is capable of

Junyi Ao is a person's name. But what is "et al"? Could it be a typo?

blademoon commented 1 year ago

@MKhalusova In Catalan et al means "and others"))) But I don't know Catalan)))

MKhalusova commented 1 year ago

@blademoon et al. comes from Latin, and does mean "and others". It's often encountered in paper citations, and is very common in English: https://www.merriam-webster.com/dictionary/et%20al.

blademoon commented 1 year ago

@MKhalusova Good afternoon, Maria. Could you review Sergey's PR https://github.com/huggingface/audio-transformers-course/pull/123 if possible? I am working hard on translating the last two units. Sergey is busy checking for minor bugs and coordinating the translation of all our work. We are trying to finish the translation and improve its quality. Thank you.

blademoon commented 1 year ago

@MKhalusova Good afternoon, Maria. A small question about translating the title "House-keeping". Almost all the translations I have found relate to household management and agriculture, but that is clearly not the meaning here. The content of the section itself didn't help much in choosing a synonym either. Could you explain more precisely what this phrase means?

blademoon commented 1 year ago

@MKhalusova Good evening, Maria. For your convenience, my PR https://github.com/huggingface/audio-transformers-course/pull/124 should go after Sergey's PR https://github.com/huggingface/audio-transformers-course/pull/123.

blademoon commented 1 year ago

@Lightmourne When the PR is accepted and you're ready, tag me. We'll go through the course and make minor adjustments to the translation. Okay?

blademoon commented 1 year ago

@MKhalusova Good afternoon, Maria ;) Can you give us an idea of when we should come back here to "polish" our translation? We managed to translate quickly, but we would also like to agree on terminology. There are objective shortcomings, and they need to be fixed.

MKhalusova commented 1 year ago

@blademoon Chapter 5 needs to be fixed (make style), and then it can be merged. I will look at "Completion of the stage of translation of the course into Russian" #124 today or tomorrow. After that, the fixes from the "polishing" can be added on top as a separate PR.

blademoon commented 1 year ago

@MKhalusova That is what we planned: merge first, then just minor edits.

A separate question: will the RL course be translated?

blademoon commented 1 year ago

@MKhalusova Regarding the translation of that one sentence: I was too hasty and made an extra commit. Your commit does the same thing, and I have approved it as well. Apparently I also need to get some sleep, or better yet take a vacation))) Thank you for your attentiveness, Maria!

blademoon commented 1 year ago

@Lightmourne Have a great birthday and thanks so much for your help! 🍸

MKhalusova commented 1 year ago

@blademoon The PR with Chapter 5 has been merged. Please fix the conflict in the TOC in https://github.com/huggingface/audio-transformers-course/pull/124 and then I will look at the PR. Nobody has translated the RL course yet; if you want, you can take on the translation :) If you run into any difficulties, ping me.

blademoon commented 1 year ago

@MKhalusova Good evening, Maria. Done.

Lightmourne commented 1 year ago

@blademoon, thanks for the congratulations! 😊

blademoon commented 1 year ago

@MKhalusova @Lightmourne

Good evening, Maria and Sergey.

1) Yesterday's additional materials have been translated.

2) Regarding the RL course, here is the link

3) Yesterday I started translating this course a little. Since there are no plans to translate it, I'm working in my own repository for now.

4) Since I won't be able to translate and edit at the same time, Sergey has agreed to take on the editing of the material that is already translated. Many thanks to him.

blademoon commented 1 year ago

We need to add a translation from #137.

blademoon commented 1 year ago

#137 has been translated.