Parse transform for extracting strings

zuiderkwast commented 10 years ago

The other gettext repo, https://github.com/etnt/gettext, has a parse transform for extracting strings (i.e. the job of xgettext). It could be ported to gettexter fairly easily. After that, it would be straightforward to add support for binaries, ?_(<<"foo">>), and to handle Erlang-specific string syntax better than xgettext does, such as adjacent string literals, ?_("string" "42"), and Erlang-specific escape sequences, ?_("\x{1F60D}") (the 😍 symbol, code point 128525).

We are willing to implement this. Hopefully you'll be interested in merging a pull request in the coming days or weeks.

seriyps commented 10 years ago

Yeah, I'll merge this, though I still think it's not a very good idea, since it requires some ".po" file writer as well. But, at the same time, I have no better solution.

I think it should be implemented not as a parse transform, but by walking the tree(s) generated by a combination of http://www.erlang.org/doc/man/erl_scan.html and http://www.erlang.org/doc/man/erl_parse.html (which is pretty similar to a parse_transform, but much more flexible). It should also be accessible not only as a library, but from the command line too (e.g. a bin/exgettext escript or so).
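A minimal sketch of that tree-walking idea, under stated assumptions: the module name `extract_sketch` and the call `i18n:gettext/1` are hypothetical placeholders, not gettexter's API. Since erl_scan and erl_parse do not expand macros, the sketch matches an already-expanded call rather than `?_(...)` itself; a real extractor would probably need a preprocessing step for that.

```erlang
-module(extract_sketch).
-export([strings/1]).

%% Source is a string holding exactly one Erlang form (e.g. one function).
%% Returns [{Line, String}] for every i18n:gettext(Literal) call found.
%% NOTE: i18n:gettext/1 is a hypothetical stand-in for the expanded macro.
strings(Source) ->
    {ok, Tokens, _EndLoc} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    lists:reverse(walk(Form, [])).

%% String literal argument. erl_parse has already concatenated adjacent
%% literals, and erl_scan has already decoded escape sequences.
walk({call, _, {remote, _, {atom, _, i18n}, {atom, _, gettext}},
      [{string, Anno, Str}]}, Acc) ->
    [{erl_anno:line(Anno), Str} | Acc];
%% Binary literal argument of the simple form <<"text">>.
walk({call, _, {remote, _, {atom, _, i18n}, {atom, _, gettext}},
      [{bin, Anno, [{bin_element, _, {string, _, Str}, default, default}]}]},
     Acc) ->
    [{erl_anno:line(Anno), Str} | Acc];
%% Generic traversal: descend into every tuple and list of the AST.
walk(Node, Acc) when is_tuple(Node) ->
    walk(tuple_to_list(Node), Acc);
walk([H | T], Acc) ->
    walk(T, walk(H, Acc));
walk(_Other, Acc) ->
    Acc.
```

For example, `extract_sketch:strings("f() -> i18n:gettext(\"string\" \"42\").")` yields a single entry whose string is `"string42"`, because the parser merges adjacent literals before the walk ever sees them.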

Plus, it should be possible to extract messages not only to a .po file, but also as a plain list of Erlang terms (see how it's done in erlydtl: https://github.com/erlydtl/erlydtl/blob/master/src/i18n/sources_parser.erl. It returns a list of #phrase{}, which may then be converted to a .po file, written to a database, or anything else).

Maybe we should have two separate modules: one for phrase extraction and one for generating a .po file from a list of phrases. This way it could be used in ErlyDTL too.
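To illustrate the second of those two modules, here is a rough sketch of a .po generator that knows nothing about how phrases were extracted. The module name `po_writer_sketch` and the phrase shape `{MsgId, [{File, Line}]}` are assumptions made for this example only; they are not erlydtl's #phrase{} record, and only a few common escapes are handled.

```erlang
-module(po_writer_sketch).
-export([to_po/1]).

%% Phrases :: [{MsgId :: string(), [{File :: string(), Line :: integer()}]}]
%% (a hypothetical shape chosen for this sketch).
%% Returns the .po entries as a UTF-8 binary.
to_po(Phrases) ->
    unicode:characters_to_binary([entry(P) || P <- Phrases]).

%% One entry: "#:" reference comments, the msgid, and an empty msgstr.
entry({MsgId, Locations}) ->
    Refs = [io_lib:format("#: ~ts:~B~n", [F, L]) || {F, L} <- Locations],
    [Refs, "msgid \"", escape(MsgId), "\"\n", "msgstr \"\"\n\n"].

%% Escape the characters that must not appear raw inside a quoted .po string.
escape(Str) ->
    lists:flatmap(fun esc/1, Str).

esc($")  -> "\\\"";
esc($\\) -> "\\\\";
esc($\n) -> "\\n";
esc($\t) -> "\\t";
esc(C)   -> [C].
```

Because the writer only takes a list of terms, the same list could just as well be stored in a database or handed to another tool, which is the point of keeping extraction and output separate.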

ghost commented 10 years ago

I've started porting the parse transform from etnt/gettext. It isn't well tested yet. I see what you mean about separating compilation from extraction. But since all the parse_transform does is traverse the AST and dump entries along the way, it should be fairly easy to adapt the code I've written so far to work with erl_scan/erl_parse. I'll look into that.

seriyps / gettexter

Parse transform for extracting strings #3