Treeofsavior / EnglishTranslation

Tree of Savior Korean to English Translation OTC

System for marking lines contributed DURING CBT w/ Real Time Translation. #807

Closed vtange closed 9 years ago

vtange commented 9 years ago

Just like we had "x"'s and stuff for marking lines that have been worked on. I think we will need a system marking lines edited in-game. I'm pretty confident most of these lines will be accurate/usable in game since we did them knowing the context of the game. The worst thing that could happen to this project now is we add translators who had not played the game and they edit lines that were made during the CBT, sending us right back to square one.

Soukyuu commented 9 years ago

The problem with the x in front of the line was that the game failed to read the line because of that. We could put an x at the start of the message instead, but tbh that just looks ugly. If the lines were edited with the context of the game already, I think re-editing changing the meaning is not that likely to happen, unless people go back to Korean and re-translate.

Soukyuu commented 9 years ago

As mentioned in #790, currently $ is a better candidate, because it's not used by anything in the game. Advantage of my suggestion is that you can

imcgames commented 9 years ago

Alright we'll check into it and consider changing if there'd be no problem with it :)

Soukyuu commented 9 years ago

As pointed out by @Sourpusss, $ might be a problem if your script uses that to denote variables. I have searched through the current files, and it seems that § might be an alternative in that case (unless ToS will contain some sort of law books in the future)
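A check like the one described above could be scripted roughly as follows. This is only a sketch: the `code<TAB>text` row layout, the sample rows, and the `unused_markers` helper are assumptions made for illustration, not the project's actual tooling.

```python
from collections import Counter
from typing import Iterable, List


def unused_markers(lines: Iterable[str], candidates: str = "$§✓") -> List[str]:
    """Return the candidate characters that never occur in any line."""
    seen = Counter()
    for line in lines:
        seen.update(ch for ch in line if ch in candidates)
    return [ch for ch in candidates if seen[ch] == 0]


# Hypothetical sample rows; real translation files may look different.
sample = [
    "QUEST_0001\tTalk to the village chief.",
    "ITEM_0042\tCosts $5 at the shop.",
]
print(unused_markers(sample))
```

Any candidate that already appears in the text is ruled out as a marker, since it could no longer be told apart from real content.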

imcgames commented 9 years ago

I think we can just use $ for now. We can place the $ at the beginning of the translation so that you can still see it in game with the $. We'll then just replace (tab)$ with (tab) for patches in game. We're also considering adding the symbol to the line code and making it possible for the translation to be read by client even with the symbol added to the line code. However, since we haven't finalized its feasibility and further developments regarding translations, what do you think of adding $ in front of the translation for now?
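The "replace (tab)$ with (tab)" step for patches might look something like this. It is a minimal sketch assuming each row is `code<TAB>translation`; `strip_marker` is a hypothetical name, not an IMC tool.

```python
MARK = "$"  # marker character discussed in this thread


def strip_marker(line: str, mark: str = MARK) -> str:
    """Replace the first '<TAB><mark>' with '<TAB>'; other lines pass through."""
    return line.replace("\t" + mark, "\t", 1)


print(strip_marker("QUEST_0001\t$Talk to the chief."))
```

Only the first tab-plus-marker pair is touched, so a `$` that appears later in the translation text is left alone.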

Soukyuu commented 9 years ago

Maybe marking unedited/untranslated lines instead would be better. Meaning, seeing a $ at the start of the translation means the line has not been finalized. This would make editing easier, because once you finished editing, you will see the line as it will appear in the final version. There might be cases otherwise, where linebreaks will shift because of the $.
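The linebreak concern can be illustrated with Python's `textwrap`. This is only an analogy: the game client's real wrapping rules are unknown here, and the width of 11 and the sample text are made up.

```python
import textwrap

text = "Talk to the chief"

# Same text, same width, with and without the marker prefix.
plain = textwrap.wrap(text, width=11)
marked = textwrap.wrap("$" + text, width=11)

print(plain)
print(marked)
```

The extra character pushes "the" onto the second line, so the marked version no longer shows where the final text will break.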

ttgmichael commented 9 years ago

@imcgames It's a good idea, that'd really help make sure people know what has been edited and what hasn't.

@Soukyuu The issue with marking unedited lines is that you don't know what has been edited. It's still possible for someone to treat lines that someone else edited as "unedited/untranslated". Marking lines that you clearly reviewed or edited is the surest way to know someone has at least touched up or reviewed the line.

Soukyuu commented 9 years ago

@ttgmichael Uh, what I suggested is the inverse of the current system. If the current one works, so will the inverse of it. Else you have to explain to me why

works, but

does not. In the first case you add the $ prefix, in the second case you remove it.

ttgmichael commented 9 years ago

@Soukyuu: Because when we choose to mark "unedited lines", it can be hard to decide whether a line is unedited or not, especially for more complicated lines such as QUEST lines. Some people might just not like how a line sounds, and thus mark it down as "unedited / needs editing". And also, it doesn't make sense to waste a line edit (in a normal PR, you should only have around 300 at most) marking down an unedited line w/o editing it?

I'm not saying the system won't work, but some lines might be harder to "finalize" than others, thus there is simply a greater likelihood of marking confusion, imo, (and an additional step of marking those unedited lines):

  1. Person A removes the mark on lines he edited (assuming these marks already exist).
  2. But Person B comes along and, by how it sounds, marks it as "unedited". This can be due to the vague line between translated and localized. Also remember, if you mark down a line, you shouldn't be editing it, as that would cause confusion about whether the line is edited or not. Thus another person has to come in and check whether the line needs editing or not.
  3. Then Person A or C will need to re-edit the line and remove the mark again. Though it is also possible that A or C simply removes the mark again w/o editing the line whatsoever, believing the line to be good already.

The current system is simpler and takes fewer line edits to get the point across (you mark lines you edited/reviewed as OK):

  1. Person A edits a line and marks the line
  2. Person B comes along, thinks the line is OK, keeps the line marked and moves on.
  3. Person C comes along, thinks the line is not OK, edits the line again, but maintains the first mark and can add another mark to label it as a second revision. (With Discord, it'll be easier to discuss these edited lines that some still want to revise, and another difference is that Person C can decide whether the line needs further editing.)

tl;dr: the above case is for lines that are harder to edit (QUEST lines). I think your system would definitely work on the simpler lines (it hinges on the idea that there will be no further revision of an edited line) as they are less likely to need multiple edits, but it's just an equivalent system compared to marking lines as you edit.
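The mark-as-you-edit flow in these three steps could be sketched like this. It assumes `code<TAB>text` rows and stacked `$` marks for revisions; `revision_count` and `mark_edited` are hypothetical helpers for illustration only.

```python
MARK = "$"  # assumed marker character


def revision_count(line: str) -> int:
    """Count the marks at the start of the translation text."""
    _, text = line.split("\t", 1)
    return len(text) - len(text.lstrip(MARK))


def mark_edited(line: str, new_text: str) -> str:
    """Replace the text, keep earlier marks, and add one for this revision.

    Limitation of this sketch: new_text must not itself start with MARK.
    """
    code, _ = line.split("\t", 1)
    return code + "\t" + MARK * (revision_count(line) + 1) + new_text


row = "QUEST_0001\tOld text"
after_a = mark_edited(row, "Better text")      # Person A edits and marks
after_c = mark_edited(after_a, "Best text")    # Person C revises, adds a mark
print(after_a, after_c, sep="\n")
```

Person B's step needs no code: a reviewer who is happy with the line leaves the row untouched.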

Soukyuu commented 9 years ago

My intention was to mark all translated lines by $ once. As soon as someone edits the line, the mark is removed. From that point on, it's no different than your suggestion. You're not describing both systems in the same way, the way I see mine is, using your words:

  1. Person A edits a line and removes the mark
  2. Person B comes along, thinks the line is OK, moves on.
  3. Person C comes along, thinks the line is not OK, edits the line again, but doesn't add the mark, because the line is already edited.

I see no reason to have PRs with lines that merely add a "needs editing" character - this is better discussed on discord, or in the PR if it's still not merged.
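For comparison, the inverted scheme from the steps in this comment might be sketched as below. Again this assumes `code<TAB>text` rows; `mark_all`, `apply_edit`, and `is_unedited` are hypothetical names.

```python
from typing import List

MARK = "$"  # assumed marker character


def mark_all(lines: List[str]) -> List[str]:
    """One-time pass: prefix every translation that is not already marked."""
    out = []
    for line in lines:
        code, text = line.split("\t", 1)
        out.append(line if text.startswith(MARK) else code + "\t" + MARK + text)
    return out


def apply_edit(line: str, new_text: str) -> str:
    """Editing a line writes the new text without the mark."""
    code, _ = line.split("\t", 1)
    return code + "\t" + new_text


def is_unedited(line: str) -> bool:
    """A line that still carries the mark has not been edited yet."""
    return line.split("\t", 1)[1].startswith(MARK)


rows = mark_all(["A\tHello", "B\tWorld"])
rows[0] = apply_edit(rows[0], "Hi there")
print([is_unedited(r) for r in rows])
```

Later edits to an already unmarked line change nothing about its status, which is the point of step 3.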

ttgmichael commented 9 years ago

I see, then you're talking about lines we have edited thus far? It's not about truly unedited lines? What to do with those lines?

Doesn't anything after step 3 mean what we have right now? Lines that don't have markings (but we don't know how much revision has been done on these lines). o-o

Me neither, but it'll mean there will be these "unedited PRs" to be pulled before edited PRs (and they may cause merge conflicts).

Soukyuu commented 9 years ago

I just don't see a difference between the current and my inverted system. Truly unedited lines will have a mark. Edited won't. Once the line has been edited once, there is no need to mark it in any way. That's the only change I'm proposing.

Merge conflicts will arise whenever people don't keep their branches up to date with master, no matter the system. In all other cases it should not matter whether you remove a mark on editing a line, or add it.

Sourpusss commented 9 years ago

I feel like adding the $ after the tab and replacing them later will work fine. Adding or subtracting to show a difference is completely arbitrary, except when you throw in the general population. I mean, we're all gonna know the rules and know what's going on, but what about everyone else scrambling to play the game? They see a $ at the start of a line? That's weird, I better delete it. Another one? I'll get rid of that too. They'll probably stop after a while, and that would only be an issue with the first few lines, but an issue nonetheless (remember why we started this system in the first place? Ironic, huh)? However, there might be the few rushers who just blaze through everything, then when they finally take the time to "help," they'll be "finalizing" random lines.

What I'm saying is that adding a completion symbol has much less room for ignorant / accidental error than a subtraction method. However, adding the symbol before the text is a great idea if we're going to keep the symbols in place during actual gameplay [testing].

f-tsang commented 9 years ago

If the files are in UTF-8, the checkmark symbol ✓ instead of a dollar sign $ could be an option.

Without any mention that the $ symbol is supposed to be an indicator for something, the average player is going to think that the $ isn't supposed to be there. A checkmark ✓, on the other hand, is a symbol most people would know about, though it's still quite likely someone will think it looks out of place.

Soukyuu commented 9 years ago

> Adding or subtracting to show a difference is completely arbitrary, except when you throw in the general population.

No, because in case of removing the mark it

> would make editing easier, because once you finished editing, you will see the line as it will appear in the final version. There might be cases otherwise, where linebreaks will shift because of the $.

It's not completely arbitrary if there is an advantage to a change.

> I mean, we're all gonna know the rules and know what's going on, but what about everyone else scrambling to play the game? They see a $ at the start of a line? That's weird, I better delete it. Another one? I'll get rid of that too.

They won't, because

> We'll then just replace (tab)$ with (tab) for patches in game.

Which means only people getting things from github and having activated the RTT (=those who know the rules) will ever see those tags.

Sourpusss commented 9 years ago

I was under the impression that the RTT was meant to be a basic feature of the beta client. Obviously, my argument is invalid if that's not the case, but if it is going to be available for everyone, then my point still stands. Also, IMC can just do the inverse of that in the "patching" process of the translation files.

The ✓ system makes sense, but if the RTT exists in a scenario described above, then we should still use the $, as it is much more accessible.

Once again, using a subtraction system would be fine if the RTT isn't completely public (and if the subtraction system is used, then we definitely shouldn't be using ✓'s). I'm all for a limited use system of the RTT.

Soukyuu commented 9 years ago

Even if the RTT will be a basic part of the client, everyone will have the unmarked files unless they manually get the marked ones from here. And if they do, they can be expected to know the rules when submitting PRs. If they waste their time by removing the mark, be it for the edited or unedited lines, it's their own fault for not reading the rules.

As such, I don't see the problem.

Sourpusss commented 9 years ago

Well, I was also under the assumption that the game files wouldn't differ from those on github at the time of the next beta launch. This is all information that needs to be confirmed by IMC. Also, if they remove the mark, they are actually wasting everyone's time / IMC's time (once again, the root reason of why we have this system in the first place).

f-tsang commented 9 years ago

On topic first,

I'm in favour of adding a symbol in front of the translation

As I had suggested earlier, we could use a checkmark ✓ instead of a dollar sign $ as the prefix. The $ symbol is a lot more convenient to type. The ✓ is difficult to type, but its meaning is obvious.

This is just a suggestion for the aesthetics of a symbol we may decide to use.


At this point, it largely depends on whether or not IMC has something in place to assist with marking edited lines.

We have two scenarios that should be considered:

  1. Is there a special symbol to use, not shown in-game?
  2. Will the in-game client directly use the files hosted on GitHub?

Is there a special symbol to use, not shown in-game? If yes, then this issue is resolved.

If not, next scenario.

Will the in-game client directly use the files hosted on GitHub? If yes, we'll need to determine what symbol to use to prevent confusion. Would be helpful if the symbol is easily understood by the other players.

This is the only case where we should be concerned about conflicts with the RTT.

If not, those in charge of merging the changes will know what to do

ttgmichael commented 9 years ago

Would this "unmarking" system help us distinguish lines that were edited during CBT? The reason for this thread was to get that across.

I would consider at least those lines from RTT to be unmarked already, since they are finalized (or at least edited to a point that's close to finalization).

The merit of the unmarking system is that we can essentially use any mark to mark "unedited" lines, and remove them when we edit.

I still think the main challenge would be getting the "unedited" lines marked in the first place (to start the system). Which lines should be considered "unedited"? Which lines should be unmarked? To make everything "unedited" would mean any work from CBT may be changed w/o the help of the RTT.

Sourpusss commented 9 years ago

I'm just going to guess everything has been merged / reset since the whole legal thing that's been going on. That being said, since we'll have access to everything again in about 2 days, we'll hopefully be able to put a decent dent on marking already "complete" lines before the next phase of international testing. All the more reason for us to figure this out before we can start editing again.

Soukyuu commented 9 years ago

@Sourpusss you're either not reading whatever people are posting, or lack the ability to think logically. I'm going to reply to you one last time, hopefully this time it will be clear.

> Well, I was also under the assumption that the game files wouldn't differ from those on github at the time of the next beta launch.

Minus the markings, because again, normal testers have nothing to do with the translation team. And if they do, they will get the marked files from the repo. The content (= translation text) will be the same as the git repo.

> This is all information that needs to be confirmed by IMC.

They already said they will remove the markings for patch files in this very thread! Who do you think I was quoting? It makes no sense whatsoever to use marked files as default, and IMC sees it that way too.

> Also, if they remove the mark, they are actually wasting everyone's time / IMC's time (once again, the root reason of why we have this system in the first place).

And this is why I think you are unable to think logically. They are not wasting our time removing the mark FOR THE NON-TRANSLATORS.

We will still have the marked (using whatever system) files by getting them from the github repo. Can we stop going around in circles now?

@ftsang92 I understand why you would want to make the mark harder to type, but I don't think it's going to be effective. All they'd have to do is copy the mark from another line. We don't even need that kind of protection, because to submit their version, they will have to make a PR, which means it will be screened by IMC and the team first. People re-editing the same lines multiple times will be filtered that way, unless their edits actually make sense.

> Will the in-game client directly use the files hosted on GitHub? If yes, we'll need to determine what symbol to use to prevent confusion.

As IMC said, it will use github files, but minus the markings, so we don't need to.

@ttgmichael

> Would this "unmarking" system help us distinguish lines that were edited during CBT? The reason for this thread was to get that across.

At the start of the beta, all translated lines are marked. During the beta, you unmark lines you edited. At the end of the beta all changes are merged. At the start of the next beta all lines will be marked again. That's one possibility.

You could also use a different mark for each beta, but I don't think it's necessary. Or you just (un) mark the lines once they have been edited with RTT as you say.

As for distinguishing non-RTT edits from RTT ones, I don't think we need a separate mark for that. The mark should be a finalizing mark. To get back to my original point, removing $ from a line means that it has been double checked in the context of the game using RTT. Everything before that should be considered preliminary edit and have a $ to indicate so.
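If the mark means "not yet checked in game", review progress can be read straight off the files. A rough sketch, assuming `code<TAB>text` rows and a leading `$` on unchecked lines (`review_progress` is a hypothetical helper):

```python
from typing import Iterable, Tuple

MARK = "$"  # assumed marker for lines not yet checked with RTT


def review_progress(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (checked, total) over rows that contain a tab separator."""
    total = 0
    pending = 0
    for line in lines:
        if "\t" not in line:
            continue  # skip blank or malformed rows
        total += 1
        if line.split("\t", 1)[1].startswith(MARK):
            pending += 1
    return total - pending, total


rows = ["A\t$Hello", "B\tWorld", "C\t$Foo", ""]
print(review_progress(rows))
```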

ttgmichael commented 9 years ago

@Soukyuu Gotcha! That actually sounds pretty good: unmarking lines for RTT edits.

So that means we are using the $ or ✓ sign to mark edited lines until the second CBT (or OBT lol), and then we will unmark those lines once we can confirm they are OK in game?

Sounds great as long as other people are on board too and aren't confused by it. :+1:

Btw, during the 1st CBT all lines were already unmarked when you edited, so you never had to unmark anything. But since IMC is planning on making a mark that won't affect game text, we can do this.

Soukyuu commented 9 years ago

It's up to IMC which system to go ahead with in the end. Personally, after having a chat with Ragunaga just now on discord, I was pretty much converted to the viewpoint that we might not need any marks at all. The reasoning being that just because someone has un/marked a line as "done" doesn't mean others shouldn't proofread it in the context of the game. Which makes marks in themselves kind of... useless.

The only use I can see atm is to clearly indicate a line has not been edited, so that you are more easily alerted of that fact. Since theoretically, unedited lines have a higher chance of containing errors.

But if we go with markings, then I think my suggestion is going to be the least of a pain in the rear.

f-tsang commented 9 years ago

@Soukyuu The marks aren't "useless", per se, but do indicate that someone has confirmed that a particular line of text looks good in-game (i.e. lines don't clip). At least, that's my reasoning for wanting a marking system.