mgeisler closed this 10 months ago
Some crate like syntect would probably work nicely. Example: https://docs.rs/syntect/latest/syntect/parsing/struct.SyntaxSet.html#method.find_syntax_by_token.
Yeah, exactly! I'm hoping that we can use that library to get the byte position of string literals and comments. I don't have any experience with syntect, but I hope you can look at it.
I have tried to use syntect at #109. It seems to work fine.
That's awesome! It will probably reduce the line count in messages.po by a good margin.
As a further step after #75, we should offer an option to only extract literal strings and comments from the code.
For this example:
we would end up with just four small strings in the POT file:
"coin toss: {}"
"heads"
"tails"
"cash prize: {}"
This would require us to process Tag::CodeBlock in a more fine-grained way, but I think it could be worth it.

The fun part would be to find a cross-language solution. I suspect our best bet would be to use a syntax highlighting library: they normally detect strings and comments, so such a library should have the necessary machinery.
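To make the idea concrete, here is a rough sketch of what "extract only string literals and comments, with byte positions" could look like. It is not what #109 does and it does not use syntect: a small hand-written scanner stands in for the highlighting library, and it only understands C-like syntax ("..." literals, // line comments and /* */ block comments). The names Kind, Fragment and extract_fragments are hypothetical, made up for this illustration.

```rust
/// Kind of translatable fragment found inside a code block.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    StringLiteral,
    Comment,
}

/// A fragment and its byte range `start..end` in the scanned source.
#[derive(Debug, PartialEq, Eq)]
struct Fragment {
    kind: Kind,
    start: usize,
    end: usize,
}

/// Hypothetical helper: find string literals and comments in C-like code.
/// Simplified on purpose: no raw strings, char literals, or nested comments.
fn extract_fragments(src: &str) -> Vec<Fragment> {
    let bytes = src.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'"' => {
                let start = i;
                i += 1;
                while i < bytes.len() && bytes[i] != b'"' {
                    // Skip the byte after a backslash so \" does not end the literal.
                    i += if bytes[i] == b'\\' { 2 } else { 1 };
                }
                // Step past the closing quote; clamp if the literal is unterminated.
                i = (i + 1).min(bytes.len());
                out.push(Fragment { kind: Kind::StringLiteral, start, end: i });
            }
            b'/' if bytes.get(i + 1) == Some(&b'/') => {
                let start = i;
                // A line comment runs up to, but not including, the newline.
                while i < bytes.len() && bytes[i] != b'\n' {
                    i += 1;
                }
                out.push(Fragment { kind: Kind::Comment, start, end: i });
            }
            b'/' if bytes.get(i + 1) == Some(&b'*') => {
                let start = i;
                i += 2;
                while i < bytes.len() && !bytes[i..].starts_with(b"*/") {
                    i += 1;
                }
                // Step past the closing */; clamp if the comment is unterminated.
                i = (i + 2).min(bytes.len());
                out.push(Fragment { kind: Kind::Comment, start, end: i });
            }
            _ => i += 1,
        }
    }
    out
}
```

Each fragment can be sliced back out of the code block with &src[f.start..f.end], so only those slices would need to go into the POT file. A real cross-language version would replace the scanner with scopes reported by a highlighting library, which is the part this sketch leaves out.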