Open ni-balexand opened 1 year ago
hi -
why would you not simply fix the file to have proper CR/NL combinations? (edit: OK I thought that was the input file, it's the output)
this part:
assert rendered == b'Line 1\r\r\nLine 2\r\r\nLine 3\r\r\n'
looks wrong?
I'd really like to use Python's text mode here and not go back to a Python 2-ish approach.
I guess that written file is the bug. I'm not really understanding the problem yet. I dont want to write "b" at all.
Mike Bayer has proposed a fix for this issue in the main branch:
switch to textmode in all cases https://gerrit.sqlalchemy.org/c/sqlalchemy/mako/+/4450
see above gerrit. does not work yet, as we have some super legacy BOM stuff I'd have to figure out and/or just remove.
the "reading bytes" thing is strictly python 2. if it's causing problems, it's out. I want to use the most modern Python features
OK I understand the issue now. here is the fix:
diff --git a/mako/cmd.py b/mako/cmd.py
index 93a6733..deb8b95 100755
--- a/mako/cmd.py
+++ b/mako/cmd.py
@@ -86,12 +86,12 @@ def cmdline(argv=None):
kw = dict(varsplit(var) for var in options.var)
try:
- rendered = template.render(**kw)
+ rendered = template.render_unicode(**kw)
except:
_exit()
else:
if output_file:
- open(output_file, "wt", encoding=output_encoding).write(rendered)
+ open(output_file, "wt", newline="", encoding=output_encoding).write(rendered)
else:
sys.stdout.write(rendered)
we have no need for text mode to convert newlines so we turn it off.
can you confirm this fixes all issues on your end? thanks
Yes, that fixes the issue.
I get the confusion now. Opening as text vs binary has several effects including:
Opening in text mode with newline="" will turn off newline conversion while leaving the character encoding alone, and continuing to use the str type for read/write operations. It looks like render_unicode will return a str type just as render did.
I don't see any specific reason why we would need to change render to render_unicode as far as the newline problem I'm looking at (reverting back to just render still looks fine).
Do you think changing to render_unicode is necessary for other string encoding reasons?
render() returns either bytes or string while render_unicode always returns string.
right it's not strictly related but the way it is is technically wrong. eventually typing will be added here and it will change anyway
The issue here can be demonstrated by running the following script on Windows:
It's odd that read_file() in util.py opens / reads in binary mode but then cmdline() in cmd.py opens / writes in text mode. The whole point of using text mode should be to convert EOLs on Windows, but it does not do this correctly.