adams85 / aspnetskeleton2

A foundation for building robust web applications on ASP.NET Core.
MIT License

Extractor: how to add context to POFile #3

Open nkosi23 opened 2 years ago

nkosi23 commented 2 years ago

Hello,

Thank you again for this neat project! When I used the (much more limited) PO implementation from the Orchard project, one thing I found very practical was that the message context of every translatable string was the source code surrounding the call to the translation component. For example, if a source file contained:

<div>
    <h1 class="display-4">@T["Welcome"]</h1>
    <p>Learn about <a href="https://docs.microsoft.com/aspnet/core">building Web apps with ASP.NET Core</a>.</p>
</div>

The resulting po file would contain something like:

#| msgctxt "<h1 class="display-4">@T["Welcome"]</h1>"
#| msgid "Welcome"
msgid "Welcome"
msgstr "Bienvenue"

And if I remember correctly, it even contained the preceding and the following line. This additional context is usually very helpful for choosing the best translation in the target language.

Could you please point me to the place in the source code that I would need to modify to get similar behavior? Right now, the message context is set to the path of the file, which is much less useful. Another problem is that an absolute path is specific to the machine of the developer running the extractor tool, so it would not work well in a team where several developers add translatable strings. For these reasons, setting the context to the source code seems to be the way to go.

adams85 commented 2 years ago

Hi there!

I'm glad you find the project useful (despite the lack of documentation)!

Technically, the source code surrounding the call in your example is not a message context in PO terms but a comment (a previous value comment, more precisely). This is an important distinction because comments have no effect on translation lookup (they're just extra information stored along with the entry for translators, tools, etc.) but message context is a part of the PO entry key, that is, its value is used in the equality check when looking up the translation for a specific key. In other words, message context is for distinguishing PO entries with identical message IDs.
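To make the distinction concrete, here is a minimal, dependency-free sketch. It is not the project's or any PO library's actual model: `Entry` and `MiniCatalog` are made-up names for illustration. The context is part of the dictionary key, while comments are only payload that the lookup never inspects:

```csharp
using System.Collections.Generic;

// Hypothetical types for illustration only; real PO libraries model entries differently.
public sealed class Entry
{
    public string Translation { get; set; }

    // Comments travel with the entry (for translators and tools) but never take part in lookup.
    public List<string> Comments { get; } = new List<string>();
}

public static class MiniCatalog
{
    // The key is the pair (context, id); comments are deliberately not part of it.
    public static Dictionary<(string ContextId, string Id), Entry> Build()
    {
        var menu = new Entry { Translation = "Ouvrir" };
        menu.Comments.Add("<button>@T[\"Open\"]</button>");

        var status = new Entry { Translation = "Ouvert" };
        status.Comments.Add("<span class=\"badge\">@T[\"Open\"]</span>");

        return new Dictionary<(string ContextId, string Id), Entry>
        {
            [("Menu", "Open")] = menu,
            [("Status", "Open")] = status,
        };
    }

    // Falls back to the id itself when no entry matches the (context, id) pair.
    public static string Translate(
        Dictionary<(string ContextId, string Id), Entry> catalog, string id, string contextId = null)
    {
        return catalog.TryGetValue((contextId, id), out var entry) ? entry.Translation : id;
    }
}
```

In this sketch, looking up "Open" with the context "Menu" and with the context "Status" hits two different entries, and looking it up without a context finds neither. Editing or deleting the comments would change none of these results.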

So, if your goal is to just add some extra information as some kind of comment, that's relatively easy to implement: you'll just need to modify the extractor tool. Start with looking around in this class. Then you need to figure out how to get the surrounding code from Roslyn. Once you get that done you can include it in the returned LocalizableTextInfo objects and, as a final step, add it as comments to the generated entries by modifying the PO catalog building logic around here.

However, if you want to add surrounding code as actual message contexts, I can't see an easy way to achieve that. The first part of it would be the same as I described above, except for adding message contexts instead of comments to the catalog entries. The second part would be inventing some magic on the lookup side, which is able to figure out the source code context of the executing code at run-time. And that looks like a pretty tough nut to crack, TBH. It may be doable using source generators though.

One more thing worth mentioning if you really want to go down this path: the implementation used by this template project allows you to pass message context to the lookup logic like this: T["Error", TextContext.From("Pages")], where TextContext.From must be the last argument.
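As for why the context would have to come last: one plausible way to build such an indexer is to accept a `params object[]` and peel a trailing context object off before formatting. The sketch below is an illustrative assumption, not a copy of the template's code; `SketchContext` and `SketchLocalizer` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the project's TextContext; only the shape matters here.
public sealed class SketchContext
{
    public string Id { get; }
    private SketchContext(string id) { Id = id; }
    public static SketchContext From(string id) => new SketchContext(id);
}

public sealed class SketchLocalizer
{
    private readonly Dictionary<(string ContextId, string Id), string> _translations;

    public SketchLocalizer(Dictionary<(string ContextId, string Id), string> translations)
    {
        _translations = translations;
    }

    public string this[string id, params object[] args]
    {
        get
        {
            args = args ?? Array.Empty<object>();
            string contextId = null;

            // Only the last argument is inspected, which is why the context must come last.
            if (args.Length > 0 && args[args.Length - 1] is SketchContext context)
            {
                contextId = context.Id;
                // Array.Resize allocates a new, shorter array; the caller's array is untouched.
                Array.Resize(ref args, args.Length - 1);
            }

            // Fall back to the id when no translation exists for the (context, id) pair.
            var format = _translations.TryGetValue((contextId, id), out var translation) ? translation : id;
            return string.Format(format, args);
        }
    }
}
```

With this shape, a context object placed anywhere but the end would simply be treated as an ordinary format argument.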

Hope this helps.

nkosi23 commented 2 years ago

Thanks a lot for these insights! I was indeed confused: what I am really after are comments (so that they show up in the comments section of the Poedit application). Your pointers are exactly what I needed to get the ball rolling, thank you so much for that 😃

I'll leave this issue open so that I can come back here with samples of my monkey patch in case they are useful to someone else.

Unfortunately, sending a PR would be way over my head. I am not familiar at all with git's tooling and workflow, and I am overwhelmed with work, so I'm afraid I wouldn't be able to add learning git to my plate. As far as git is concerned, all I'm able to do is fork and download zips from GitHub 😆 My daily driver is Mercurial, with TortoiseHg as the GUI.

adams85 commented 2 years ago

> I leave this issue open so that I can come back here with samples of my monkey patch in case it is useful to someone else.

Good idea! 👍

> Unfortunately sending a PR would be way over my head

Once you have something working, I'll consider including it in the project (probably as an optional feature). If so, I'll handle the git-related stuff as well.

> I am not familiar at all with git's tooling and workflow and am overwhelmed with work so I wouldn't be able to add learning git to my plate

I recommend learning it eventually because of its ubiquity (and because it's a truly capable tool). The basics are not complicated at all once you get the hang of its distributed nature (i.e. the concept of remote repos and local working copies). But interestingly, you don't need a remote server for using it - which was one of the biggest selling points for me, coming from SVN...

> My daily driver is using Mercurial using the GUI TortoiseHG.

There's Tortoise software for Git as well! ;)

nkosi23 commented 2 years ago

Okay, I got it working! 😃 Here is what I've done:

I have added the following methods to CSharpTextExtractor.cs as well as an additional field:


private List<string> _linesInFile = new List<string>();

protected void SetSourceCodeLines(string content)
{
    //Split on both Windows and Unix line endings and keep empty lines,
    //otherwise the list indices no longer match the reported line numbers
    var lines = content.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None);
    _linesInFile = new List<string>(lines);
}

protected virtual string GetCode(string content, CancellationToken cancellationToken)
{
    SetSourceCodeLines(content);
    return content;
}

private string GetSurroundingSourceLines(int lineNumber)
{
    //TODO: Add to CSharpTextExtractorSettings
    int numberOfLinesBefore = 2;
    int numberOfLinesAfter = 2;

    if (_linesInFile.Count == 0)
        return string.Empty;

    //lineNumber is treated as 1-based while list indices are 0-based;
    //clamping keeps calls near the start or end of the file from going out of range
    var index = Math.Min(Math.Max(lineNumber - 1, 0), _linesInFile.Count - 1);
    var startIndex = Math.Max(index - numberOfLinesBefore, 0);
    var endIndex = Math.Min(index + numberOfLinesAfter, _linesInFile.Count - 1);

    var sb = new StringBuilder();
    for (int i = startIndex; i <= endIndex; i++)
    {
        //The comment sign is already added to the first line
        if (i > startIndex)
            sb.Append("#. ");

        sb.AppendLine(_linesInFile[i]);
    }

    return sb.ToString();
}

In the same file I have also modified the methods AnalyzeDecoratedDeclaration and AnalyzeElementAccessExpressions to set the Comment property of LocalizableTextInfo:

return new LocalizableTextInfo
{
    LineNumber = lineNumber,
    Id = id,
    PluralId = GetPluralId(argList),
    ContextId = GetContextId(argList),
    Comment = GetSurroundingSourceLines(lineNumber) //This is what I have added
};

We must also modify CSharpRazorTextExtractor to ensure it makes the call to SetSourceCodeLines:

protected override string GetCode(string content, CancellationToken cancellationToken)
{
    SetSourceCodeLines(content); //This is the line I have added

    var sourceDocument = RazorSourceDocument.Create(content, "_");
    var codeDocument = _projectEngine.Process(sourceDocument, fileKind: null, importSources: null, tagHelpers: null);
    var parsedDocument = codeDocument.GetCSharpDocument();
    var errorDiagnostic = parsedDocument.Diagnostics.OfType<RazorDiagnostic>().FirstOrDefault(d => d.Severity == RazorDiagnosticSeverity.Error);
    if (errorDiagnostic != null)
        throw new ArgumentException($"Razor code has errors: {errorDiagnostic}.", nameof(content));

    return parsedDocument.GeneratedCode;
}

Now Poedit displays the source code lines like I wanted 😃


One remark is that the field I have added makes the extractor even less thread safe than before.
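One possible way to avoid the shared field is sketched below as a standalone helper. The names `SourceContext`, `SplitLines` and `GetSurroundingLines` are hypothetical, and `lineNumber` is assumed to be 1-based. The idea is to split the content once per file, pass the resulting array along explicitly, and return the lines as a list so that the PO-writing side decides how to format them:

```csharp
using System;
using System.Collections.Generic;

public static class SourceContext
{
    // Splits on both Windows and Unix line endings and keeps empty lines,
    // so that index i always corresponds to line number i + 1.
    public static string[] SplitLines(string content)
    {
        return (content ?? string.Empty).Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
    }

    // Returns up to `before` lines above and `after` lines below the given 1-based line,
    // clamped to the bounds of the file. No state is kept between calls.
    public static IReadOnlyList<string> GetSurroundingLines(
        string[] lines, int lineNumber, int before = 2, int after = 2)
    {
        if (lines.Length == 0 || lineNumber < 1 || lineNumber > lines.Length)
            return Array.Empty<string>();

        var start = Math.Max(lineNumber - 1 - before, 0);
        var end = Math.Min(lineNumber - 1 + after, lines.Length - 1);

        var result = new List<string>(end - start + 1);
        for (var i = start; i <= end; i++)
            result.Add(lines[i]);

        return result;
    }
}
```

Because nothing is stored on the extractor instance, two files could be processed concurrently without stepping on each other. Each returned line could then be attached as its own comment rather than embedding `#. ` prefixes in a single string, though that would require changes on the catalog-building side as well.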