scelis / twine

Twine is a command line tool for managing your strings and their translations.

Multiple line translations are not considered #204

Closed felipeplets closed 5 years ago

felipeplets commented 7 years ago

I have a multi-line string in my Localizable.strings file. When I run twine consume-localization-file twine.txt Localizable.strings --consume-all --consume-comments, all the strings go into the file except the ones that span multiple lines.

One example that I have, and that works perfectly on iOS, is:

"CONDITIONS" = "My conditions are the ones below:

Please read it carefully:"

What I will do is convert all the line breaks to \n, because there are only a few such cases, but I think this is a Twine bug, since there should be no problem reading this string.
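The workaround described above could be automated roughly as follows. This is only a sketch: escape_line_breaks and QUOTED_STRING are hypothetical names, not part of Twine, and the approach assumes that every double quote in the file belongs to a key or value (a comment containing a stray quote would confuse it).

```ruby
# Hypothetical helper, not part of Twine: replaces raw line breaks inside
# double-quoted strings with the two-character escape sequence \n.

# A double-quoted string; backslash escapes are allowed and the string
# may span several lines.
QUOTED_STRING = /"(?:[^"\\]|\\.)*"/m

def escape_line_breaks(strings_content)
  strings_content.gsub(QUOTED_STRING) do |quoted|
    # In the block form of gsub the return value is inserted literally,
    # so '\n' stays a backslash followed by the letter n.
    quoted.gsub(/\r?\n/) { '\n' }
  end
end

input = "\"CONDITIONS\" = \"My conditions are the ones below:\n\nPlease read it carefully:\";\n"
puts escape_line_breaks(input)
```

Line breaks outside of quoted strings are left alone, so the overall layout of the file is preserved and only the values change.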

scelis commented 7 years ago

Thanks for the report. This seems like a bug, though one I would classify as relatively minor. I think most people tend to use \n in their Localizable.strings files.

sebastianludwig commented 7 years ago

The problem is that Apple#read uses io.gets to read the input line by line. Fixing this issue therefore probably requires rewriting that method. That might be a good idea anyway, because it contains this monster of a regexp:

/^\s*((?:"(?:[^"\\]|\\.)+")|(?:[^"\s=]+))\s*=\s*"((?:[^"\\]|\\.)*)"/

I'd still like to stick to regexps and not use Parslet or the like. Anyway, that's too much work for today; hopefully soon.
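To illustrate the line-by-line problem, here is a minimal sketch. read_line_by_line is a simplified stand-in written for this example, not Twine's actual Apple#read; only the regexp is copied verbatim from the comment above.

```ruby
require 'stringio'

# The regexp quoted in the comment above, copied verbatim.
ORIGINAL_REGEXP = /^\s*((?:"(?:[^"\\]|\\.)+")|(?:[^"\s=]+))\s*=\s*"((?:[^"\\]|\\.)*)"/

# Simplified stand-in for a reader built on io.gets (an assumption,
# not Twine's real implementation).
def read_line_by_line(io)
  entries = {}
  while (line = io.gets)
    match = ORIGINAL_REGEXP.match(line)
    # Group 1 is the key (including its quotes), group 2 is the value.
    entries[match[1]] = match[2] if match
  end
  entries
end

content = <<~STRINGS
  "GREETING" = "Hello";
  "CONDITIONS" = "My conditions are the ones below:

  Please read it carefully:";
STRINGS

p read_line_by_line(StringIO.new(content))
```

Only the GREETING entry is found. The line that opens the CONDITIONS value has no closing quote, and the line that closes it has no key or = sign, so neither line matches on its own.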

scelis commented 7 years ago

Parslet sounds kinda neat. I've strongly considered moving to a better parser that doesn't require relying on regular expressions but didn't know which library to pick and wasn't sure if it was completely worth the effort. If we did move to using something like Parslet, I would want to go with a library that would be easy to swap out later since dependencies like these seem to come and go.

sebastianludwig commented 7 years ago

Disclaimer: I also have only limited experience with Parslet (read: used it in one project). Anyway, the idea behind Parslet is that:

  1. you write a Parslet parser to parse text files into basically nested Hashes/Arrays/Primitives
  2. use a Parslet Transform to transform these into your own AST
  3. that one then needs to be transformed into an object model (TwineFile, Section, etc)

For simple syntaxes, and with a little abuse of Parslet, steps 2 and 3 can be merged.

Bottom line: It's a lot of code and complexity, even for simple parsers. It definitely raises the contribution barrier. The set of people that have worked with regexps is a lot bigger than the set of people that are willing to get into full-blown parser development.

At the moment I don't see the need for this complexity just to parse what are basically fancy versions of key = value. The complete read methods of all formatters are ≤ 30 lines (Tizen being the exception with ~40 lines). Compared to multiple classes spread over multiple files, the benefit of Parslet is not great enough (yet) in my opinion.

To improve the parsing, I think we should break up the complexity (for example, several smaller regexps instead of huge one-liners), add more inline documentation, and maybe add a method documentation block that explains the general approach. If we hit a maintainability limit there, then I'd agree that it's time to move to something like Parslet.
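As a rough sketch of what "several smaller regexps" could look like, the one-liner quoted earlier in the thread can be assembled from named pieces. All constant names here are illustrative and do not exist in Twine. Matching the whole file content at once, instead of line by line, is likewise an assumption made for this example.

```ruby
# Illustrative names only; none of these constants exist in Twine.

# One character of a quoted string: anything except " and \, or a backslash escape.
CHAR         = /(?:[^"\\]|\\.)/
# A key written with quotes, e.g. "KEY".
QUOTED_KEY   = /"#{CHAR}+"/
# A key written without quotes, e.g. KEY.
UNQUOTED_KEY = /[^"\s=]+/
# Capture group 1: the key in either form.
KEY          = /(#{QUOTED_KEY}|#{UNQUOTED_KEY})/
# Capture group 2: the value, without its surrounding quotes.
VALUE        = /"(#{CHAR}*)"/
# A complete entry: key = "value"
ENTRY        = /^\s*#{KEY}\s*=\s*#{VALUE}/

content = "\"GREETING\" = \"Hello\";\n" \
          "\"CONDITIONS\" = \"Line one:\n\nLine two:\";\n" \
          "PLAIN = \"x\";\n"

# Assumption: the whole content is scanned at once, so a value may span lines.
entries = content.scan(ENTRY).to_h
entries.each { |key, value| puts "#{key} => #{value.inspect}" }
```

Because [^"\\] also matches a line break, the assembled regexp picks up the multi-line value once it sees the whole content. A real reader would still have to deal with comments and other file details that this sketch ignores.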

scelis commented 7 years ago

@sebastianludwig Thank you so much! I appreciate the thoughts and agree with your assessment. Let's stick with the current strategy for now.

scelis commented 5 years ago

Closing for now as "working as designed". Twine does not support multi-line Apple strings files.