q191201771 / lalext

ffmpeg build script, lal website document, rfc document.
MIT License

An attempt at a full English translation of the website documentation #23

Closed: GwynethLlewelyn closed this issue 1 year ago

GwynethLlewelyn commented 1 year ago

Hello all!

My apologies for not writing in Chinese. In fact...

A long time ago I promised to attempt to do a translation of the website into English; I never forgot my promise, but I just couldn't find the time.

The missing pages

During the translation, I noticed that the lalext repository actually is incomplete: there are a few more pages on the website that are not included here. Those that I could find (through links on the other pages!) I've translated as well. There weren't many — perhaps half a dozen or so — but I'm concerned that the website might be out of sync with this repository. If that's the case, there might be some pages that have been already updated on the website, but not on the repository, and so the translation will be out of date...

What I mostly did was to make a full copy of all files under ./lal_website to a new directory, ./lal_website_english, and start translating them — one by one — without touching the original files.

Besides that, in the other directories, when appropriate, I also translated things like READMEs (or, rather, made a bilingual version of them). I wasn't very thorough, though, so I might have missed a file or two. Please let me know :-)

Automatically-generated pages?

Regarding the website... I understand that @q191201771 might have a tool to automatically generate most of the pages and push them to GitHub (using Hugo, perhaps?). This means that some essential steps are very likely being done on the editing side, namely, generating tables of contents, the sidebars, and so forth. I have no such tools here (and even if I had, I probably wouldn't be able to replicate @q191201771's development setup anyway!); therefore, all those items have been translated manually. I'm aware that, in many cases, the actual title on the page may be different from the title on the sidebar! This is just because I started with the sidebar and the table of contents, and, later, might have changed my mind about how to translate the title best, and forgot to update them accordingly. My suggestion, therefore, is to run whatever tool generates the indexes and tables of contents on the translated pages, and use the generated indexes from there.

Changing links (from absolute to relative) wherever possible

Now, obviously, I needed to do some simple checking if the pages were being correctly rendered (the editor I use does nice previews!), and that also meant dealing with internal links. In almost all cases, when it was obvious that the link was 'internal' and just to another page in the same directory, I replaced the original absolute path with a relative path — which allows me to do the testing directly on the directory, without the need of setting up a webserver with a CMS, etc. I have no idea what is 'better' — absolute paths for the links, or relative ones? I'll let you decide :) If the choice is to use absolute paths, like on the Chinese version, then all the links have to be (manually?) replaced.
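For anyone who prefers to automate the link rewrite described above, here is a minimal sketch in Python. It is not the process actually used for the translation (that was done by hand); the site root `https://example.com/docs/` and the function name `to_relative` are placeholders invented for illustration, and only inline Markdown links of the form `[text](target)` are handled.

```python
import re

# Hypothetical site root; the real base URL is not stated in this thread.
SITE_ROOT = "https://example.com/docs/"

# Matches inline Markdown links of the form [text](target).
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def to_relative(markdown, site_root=SITE_ROOT):
    """Rewrite absolute links to pages directly under site_root as ./page links."""

    def replace(match):
        text, target = match.group(1), match.group(2)
        if target.startswith(site_root):
            rest = target[len(site_root):]
            # Only rewrite pages in the same directory (no further path segments
            # before an optional #fragment).
            if rest and "/" not in rest.split("#", 1)[0]:
                return f"[{text}](./{rest})"
        # External links and deeper paths are left untouched.
        return match.group(0)

    return LINK_RE.sub(replace, markdown)
```

Calling `to_relative("See [API](https://example.com/docs/HTTPAPI.md)")` yields `See [API](./HTTPAPI.md)`. Going back to absolute paths would be the same substitution in reverse, which is one argument for keeping whichever convention the site generator expects.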

About my own limitations as a translator

A word on the translation itself, and a disclaimer:

As I've said before here on the Issues, I do not speak (or write!) any Chinese language at all. That means having to resort to machine translation as a starting point. When I first found the documentation for lalserver, I used Google Translate, which does a decent job of making everything understandable for a non-Chinese speaker, even if the translation (for an English speaker!) might sound funny. Unfortunately, there is a limit on how 'deep' Google Translate goes. For instance, for some reason, everything which is rendered using <pre> will be ignored by Google Translate, and this means that a lot of fundamental comments on the code (or on the JSON examples!), which are essential to understand what the parameters mean and what the options are, would not get translated automatically. Naturally enough, you can do the translations manually, by copying & pasting each line into Google Translate, seeing what it means in English, and copying it to the translated copy, all manually, as I actually did on https://github.com/q191201771/lal/issues/145.
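To find the text that a page-level translator skips, one option is to pull the contents of every `<pre>` block out of the rendered HTML first. The sketch below uses only Python's standard `html.parser` module; the class and function names (`PreExtractor`, `extract_pre_blocks`) are made up for this example, and it was not part of the workflow described in this post.

```python
from html.parser import HTMLParser


class PreExtractor(HTMLParser):
    """Collects the text content of each top-level <pre> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # current nesting level inside <pre>
        self.blocks = []  # one string per top-level <pre> block

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.depth += 1
            if self.depth == 1:
                self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "pre" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Text inside nested tags such as <code> is still collected.
        if self.depth:
            self.blocks[-1] += data


def extract_pre_blocks(html):
    parser = PreExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Each returned string can then be pasted into the translator as ordinary text, so the comments inside code and JSON samples are no longer skipped.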

This would take far too much time for me. Therefore, I used the German-made DeepL Translator instead. It has a much more limited range of languages, but it does its job well. On the free version, there is (like on Google Translate) a limit to how many characters you're allowed to translate per day. And DeepL is much better at understanding Markdown (Google doesn't even try!). In many cases, I could simply copy & paste the output of a whole page, and just do minor fixes on the Markdown.

Obviously, machine translation is not a complete solution; the results are often confusing, and even when it gets the grammar right, it sometimes produces sentences with a very twisted meaning that barely make any sense. That's where the human being steps in: to make the text more legible and accessible to an English speaker.

Because most of the lalserver documentation is written using a more technical language, the translations, in general, are not too bad. They still sound "machine-like" and need some polishing. However, all these machine translation tools fail to correctly translate a lighter, humorous, informal tone: the end result will unfortunately always sound as if a bored lawyer was writing the text (my apologies to any lawyer who might be reading this!). Here and there, I suspect that yoko's tone is much lighter than DeepL thinks it is, and, therefore, I humbly apologise in advance if I couldn't correctly capture some of that tone in the translated text. With luck, future revisions — hopefully made by someone equally fluent in Chinese and in English and able to understand the subtleties of both languages better! — might improve the current translation and make it closer (in tone) to the Chinese original!

And don't even ask me about Chinese Internet jargon. It took me a whole afternoon to figure out what (manual dog head) meant! 🤣 But I learned a bit of Chinese Internet jargon that way, found it extremely amusing, and will certainly read more on that fascinating subject 😉 (tip to native English speakers: that expression is used to denote sarcasm, but its origin requires a quite long explanation — feel free to search for it on Google 😀 ).

Anyway... there is another tool I used from DeepL (which is independent from the machine translation) known as DeepL Write, currently in Beta, which attempts to polish your grammar (only English and German for now). Depending on the context, it might produce a more reasonable text; it served me sometimes when the translation was so convoluted that, although I could understand what was meant, it was clearly not the best way to express it! Most of the time, however, there really was no "tool" to help me, and I attempted to do my best, but I'm aware I wasn't necessarily very consistent all the time. After all, in English, the website is about 30,000 words long (!), and even taking into account that much of it is code or JSON or tables, well, these also need some proper formatting here and there. And I'm sure there is still much to be done!

My opinionated choices

In many cases, especially those where I didn't have access to the original Markdown file (those that have not been committed to the repository), I had nothing to work with except the page rendered to HTML. That meant working backwards, from the fully-rendered page in HTML to a simple(r) representation in Markdown, and sometimes it became necessary to make decisions about the design elements. Naturally enough, those missing pages were modelled on the existing ones, but there might be some differences here and there.

The same criterion was also applied to certain choices of words, expressions, or even acronyms. For example: I've tried to change all technical acronyms to all caps; every time I had some doubts about a particular choice, I would Google for the correct way of representing such acronyms, and in almost all cases, the references pointed to uppercase versions of the acronyms. Therefore, in the English translation, you will see RTMP/RTSP, HLS, H.264, etc., always in caps (unless I forgot to change one or two!), but when referring to actual code or data structures, I've used whatever @q191201771 used.
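A consistency pass like the one described above could be partly automated. The following is only a sketch under stated assumptions: the acronym list is illustrative (not the list actually used), the function name `upper_acronyms` is invented here, and it works one line at a time, leaving inline code spans (text between single backticks) untouched. Fenced code blocks would need to be skipped by the caller.

```python
import re

# Illustrative list only; extend it to whatever acronyms the pages use.
ACRONYMS = ["rtmp", "rtsp", "hls", "flv"]

# Whole-word, case-insensitive match for any listed acronym.
WORD_RE = re.compile(r"\b(" + "|".join(ACRONYMS) + r")\b", re.IGNORECASE)

# Inline code spans delimited by single backticks.
CODE_SPAN_RE = re.compile(r"(`[^`]*`)")


def upper_acronyms(line):
    """Upper-case known acronyms in prose, leaving inline code spans as-is."""
    parts = CODE_SPAN_RE.split(line)
    # Because the pattern has a capturing group, odd indices are code spans.
    return "".join(
        part if i % 2 else WORD_RE.sub(lambda m: m.group(0).upper(), part)
        for i, part in enumerate(parts)
    )
```

For example, `upper_acronyms("Push rtmp to `rtmp_url`, pull Hls.")` leaves the identifier inside the backticks alone and returns the prose with `RTMP` and `HLS` in caps, which matches the rule of keeping code and data-structure names exactly as the author wrote them.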

What is still missing

Unfortunately, all the images on the website have notes here and there in Chinese. Because these are raster images (and not vector ones), I'm afraid I couldn't translate them, at least not for the time being (there are a few scanners that might be able to recognise Chinese characters and I may try these out in a future revision). If anyone still has the original images, in whatever format they may be, please add them to the repository, if possible. That way, I might be able to isolate the text layers, translate them, and generate new images with the labels in English.

If the original files are not available (for any reason), the solution is to make them again, from scratch. Fortunately, most are diagrams (perhaps made on Draw.io?...) that can be easily redrawn.

The same, in fact, applies to the logo. It's a very simple design which can very easily be converted into a vector format (namely, SVG). I'm happy to do that; I would just like to know the name of the font used to write "Live And Live" beneath the logo itself. If someone knows that, please let me know; otherwise, I can always try one of those fancy font recognition tools to see what font it suggests.

Translation license

Except where noted otherwise, I release all my intellectual property on these translations into the public domain, so long as they are used in the context of the lal/lalserver/lalext projects.

q191201771 commented 1 year ago

I respect your work and the perseverance you have shown. Thank you for your love and support for LaL.

I apologize for the missing documents. I will check whether the missing documents contain sensitive content that is not suitable for public release during my free time and will publish them accordingly.

docsify: I've used this tool to convert Markdown files into websites.

GwynethLlewelyn commented 1 year ago

My pleasure! You know I'm a big fan of lalserver, and wish it to become more and more used :-) Hopefully, even a not-so-good translation into English encourages new users to use it as well.

I hope that those missing documents do not have any sensitive content!!! It's just because I have translated them as well, or, at least, all the extra pages I could find. All I can say is that they looked harmless enough to me, but please re-check what was translated that shouldn't be!

Also, thanks for the link to docsify, I'm always learning about new tools :) I hope that somehow you can run docsify on the translated documents as well. I've read the docsify manual only briefly, and it seems that the directory structure should be different, i.e. lal_website_english should be a subdirectory of lal_website — at least, that's what it looks like on the structure shown on the docsify manual. Let me know if you wish me to change that, or if it's not really needed.

q191201771 commented 1 year ago

I'll start by adding a link on the documentation site pointing to the English documentation directory in the lalext GitHub repo. Later, when I have time, I'll actually add the English documentation to the documentation site, including the documentation site's directory.

Thanks again.

Also, DeepL is a great site that I used to translate this reply, thanks for the recommendation.