Closed maufl closed 8 years ago
@maufl can you please paste some examples of the msgids extracted from jsxgettext? Maybe we can provide some directory whitelisting. This way we won't remove msgids if they belong to a certain reference.
Currently it looks like this
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Report-Msgid-Bugs-To: \n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"POT-Creation-Date: 2016-03-30 12:35+0000\n"
#: 0:91
msgid "Column 1"
msgstr ""
#: 0:96
msgid "Column 2"
msgid_plural "Columns 2"
msgstr[0] ""
msgstr[1] ""
But I opened a pull request in the loader repo that should fix the file names, e.g. instead of `0` it will be `web/static/js/react_test.js`.
Directory whitelisting sounds like a good idea to me.
The change landed in the loader and now the file looks like this:
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Report-Msgid-Bugs-To: \n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"POT-Creation-Date: 2016-04-04 07:03+0000\n"
#: web/static/js/react_test.js:94
msgid "Buy envelope"
msgid_plural "Buy envelopes"
msgstr[0] ""
msgstr[1] ""
#: web/static/js/react_test.js:95
msgid "First Column"
msgstr ""
#: web/static/js/react_test.js:100
msgid "Second Column"
msgstr ""
Awesome! So I think we could support some sort of whitelisting where you would tell it to not touch translations from "web/static/js". Would you be interested in trying a patch? :)
I'm already on it. I thought about the level at which white-/blacklisting could happen. We could either exclude POT files/domains while extracting based on a pattern, or we could keep translations based on references. Do you know which solution you would prefer? I have a working patch for the first approach but I could write a patch for the second approach too.
Both are fine to me. Excluding domains is easier for sure if you guarantee to keep the JS translations in separate domains. Let's wait and see what @whatyouhide thinks about this.
In my opinion, one of the problems here could be that when extracting new translations, we read all translations from all POT files _in whatever subdir of each backend's `:priv_dir` directory_ (as can be seen here):
# Returns all the .pot files for each of the given `backends`.
defp pot_files_for_backends(backends) do
  Enum.flat_map backends, fn backend ->
    backend.__gettext__(:priv)
    |> Path.join("**/*.pot")
    |> Path.wildcard()
  end
end
I think one step in the right direction could be, as a start, to change that wildcard to `"*.pot"`: we only support POT files directly in the `:priv_dir` directory, not in subdirectories (no slashes in domain names, see #76 as well).
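A minimal sketch of what the narrower wildcard would match, assuming a plain directory path instead of a backend. The module and function names (`PotFilesSketch`, `top_level_pot_files/1`) are hypothetical and not part of Gettext:

```elixir
defmodule PotFilesSketch do
  # Hypothetical helper, not Gettext's own code.
  # Returns only the .pot files that sit directly in `priv_dir`;
  # files in subdirectories are not matched by "*.pot".
  def top_level_pot_files(priv_dir) do
    priv_dir
    |> Path.join("*.pot")
    |> Path.wildcard()
  end
end
```

With this pattern, a file such as `priv/gettext/default.pot` would be picked up, while a POT file in a subdirectory of `priv/gettext` would be ignored.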
This could work towards solving this issue as well, if the JS translations are simply stuffed in, e.g., `priv/gettext/frontend/default.pot` (and `priv/gettext/frontend/en_US/LC_MESSAGES/default.po` in turn). @maufl would this be a feasible solution? We can still support black- or whitelisting, of course, but I think this is a good step either way.
I'm not sure. I just tried this and it seems that currently, each directory under `:priv_dir` is expected to be a locale by the `gettext.merge` task. So putting frontend files into a separate folder would require checking whether the folder contains translations for a locale.
I could put the frontend translations into a completely different folder, but then I would have to run merging twice. My goal was to manage both sets of translations together, e.g. running `mix gettext.merge` would merge all translations, including those from the frontend. I'm not sure if this can be achieved without having both POT files in the same dir.
@maufl yes, you're right. Gettext (`gettext.merge` in particular) still assumes every directory under `:priv_dir` is a locale.
So, I don't particularly like either of the proposed approaches, yet I can't think of anything better. Excluding domains should definitely be easier to implement, but the user is then required to stuff all the frontend (or whatever scope) translations in a domain. Of course, you can have `frontend_DOMAIN.pot`, but it's still somewhat ugly. On the other hand, this has the benefit of not making Elixir's Gettext deal with translations coming from different sources in the same POT file.
Maybe keeping translations based on the reference pattern could be good: when merging, we check which references point to protected paths and we never purge those translations. Somehow, however, this still feels dirty to me as the logic behind purging, merging and friends is already somewhat convoluted.
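The reference-based idea can be sketched roughly as follows. This is only an illustration under stated assumptions: translations are modelled as plain maps with `:msgid` and `:references` (a list of `{file, line}` tuples), and the module and function names (`PurgeSketch`, `protected?/2`, `purge/3`) are hypothetical and not taken from Gettext's code:

```elixir
defmodule PurgeSketch do
  # Hypothetical sketch, not Gettext's implementation.

  # A translation is "protected" when at least one of its references
  # points to a file whose path matches `pattern`.
  def protected?(%{references: refs}, pattern) do
    Enum.any?(refs, fn {file, _line} -> Regex.match?(pattern, file) end)
  end

  # Keeps a translation when any of these holds:
  #   * it has no references (treated as written by the user),
  #   * its msgid was found again by the extractor,
  #   * one of its references matches the protected pattern.
  # Everything else is purged.
  def purge(existing, extracted_msgids, pattern) do
    Enum.filter(existing, fn translation ->
      translation.references == [] or
        translation.msgid in extracted_msgids or
        protected?(translation, pattern)
    end)
  end
end
```

For example, with the pattern `~r{^web/static/js/}`, a translation referenced from `web/static/js/react_test.js` would survive extraction even though the Elixir extractor never sees it.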
Last proposal, directed mainly to @josevalim: when working on this, we decided the way to tell if a translation comes from Elixir or from the user would be the presence of one or more reference (`#:`) comments; it was the easiest and most straightforward solution at the time. However, I proposed something else as well: we identify Elixir translations with an `elixir` flag (in a `#,` comment, more here). If we do this, I think we don't have to change anything else, because JS translations will not have the `elixir` flag. Definitely a lot more noise in PO/POT files, but probably more straightforward behaviour. Wdyt?
I like your last proposal.
I thought about white-/blacklisting domains or reference paths, and I think it would work if there is only one backend, which is probably the case most of the time. As soon as we have multiple backends, which is a supported case, it's unclear where the configuration for white-/blacklists should go. Especially in the references case, because right now there is no backend information available at the point when translations are merged, only when POT files are read.
I am not a fan of the last proposal because of the noise. :( I wouldn't worry about multiple backends, because why would you also spread the JavaScript-related translations across backends?
If we go with blacklisting/whitelisting, we can configure it in the `:gettext` application I think, so that it applies to all backends.
I am fine with per backend or for the :gettext app.
I proposed to configure it for `:gettext` because I have a hard time imagining users wanting something blacklisted from one backend but not from another; if Gettext for Elixir is not handling your frontend translations, then it's not a decision at the backend level :). Does it make sense? If so, @maufl wdyt?
Yes, I think that's a good idea. In this case I would go for reference white-/blacklisting, so the user is free to use the same domain in frontend and backend.
I can take a shot at implementing it. :)
@maufl yep, let's go with this for now:
> when merging, we check which references point to protected paths and we never purge those translations
Of course, please do take a shot at implementing it, it would be awesome :smiley: Let me know if you find any bump in the road, especially if it's caused by some code being not-so-clear :).
How would you name this configuration option? I'm not sure how to best express the intention.
(protect|exclude)_(refs)?_from_purging
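Purely for illustration, one of the candidate names above written out as an application-level config entry. The key `:exclude_refs_from_purging` is just one expansion of that pattern and is an assumption here, not a confirmed Gettext option; the regex value is likewise only an example:

```elixir
# config/config.exs
# (`import Config` needs Elixir 1.9 or later; older projects use `use Mix.Config`.)
import Config

# Hypothetical option name: keep every translation that has a reference
# under web/static/js, even if the Elixir extractor did not find it.
config :gettext, exclude_refs_from_purging: ~r{^web/static/js/}
```

Setting it on the `:gettext` application rather than on a backend matches the idea that the rule should apply to all backends.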
I think this patch might suffice. Can you take a look at the code and let me know if I forgot about something? If the code is OK, I will add tests and documentation and open a pull request (or I can do this now).
There were some bumps :) Sometimes it's not easy to follow the code, but I cannot point my finger at any particular problem. Also, I'm quite new to Elixir (~2 months in), so it might just be me not being used to functional programming.
@maufl you can open a PR so I can review the changes if you want. :)
Now I have a test and documentation. I opened a pull request.
:heart: :green_heart: :blue_heart: :yellow_heart: :purple_heart:
This landed in #86, so closing this.
I read through the issues and found that it's a feature, but unfortunately not for me. When running `mix gettext.extract`, all msgids from POT files that have a reference are deleted if they are not present in the extracted msgids.
I'm trying to extract gettext msgids from React components that we use in our Phoenix project. Phoenix uses Webpack to bundle frontend assets and I found this handy loader that will extract msgids from JS(X) files when they are loaded via Webpack. All the msgids are dumped to a new POT file in `priv/gettext` (because I configured it so) and they all have references. Now when I extract msgids using mix, the POT file is emptied :(
I can't change the behaviour of the loader without patching jsxgettext and jsxgettext-loader. I could put the POT file in another directory, but right now the workflow for Phoenix and React is perfectly in sync and I would like to keep it that way.
Is there a possibility to force (Elixir) gettext to not delete msgids from a POT file?