go-python / gpython

gpython is a python interpreter written in go "batteries not included"
BSD 3-Clause "New" or "Revised" License
870 stars 95 forks source link

parser: fix CRLF(\r\n) file parsing error, SyntaxError: 'invalid syntax' #219

Closed wetor closed 1 year ago

wetor commented 1 year ago

The python file on the Windows platform may be in CRLF format. For example, git may convert the file to CRLF under Windows, resulting in parsing failure

ncw commented 1 year ago

The fix looks reasonable to me.

Can you link some docs showing .py files can end CR LF please?

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (337df2a) 74.41% compared to head (42f20b7) 74.42%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #219 +/- ## ======================================= Coverage 74.41% 74.42% ======================================= Files 76 76 Lines 12673 12675 +2 ======================================= + Hits 9431 9433 +2 Misses 2567 2567 Partials 675 675 ``` | [Impacted Files](https://codecov.io/gh/go-python/gpython/pull/219?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=go-python) | Coverage Δ | | |---|---|---| | [parser/lexer.go](https://codecov.io/gh/go-python/gpython/pull/219?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=go-python#diff-cGFyc2VyL2xleGVyLmdv) | `89.07% <100.00%> (+0.04%)` | :arrow_up: | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=go-python). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=go-python)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

wetor commented 1 year ago

here https://docs.python.org/3.5/reference/lexical_analysis.html of course, we can also use "build tag" to do special processing on the Windows platform

ncw commented 1 year ago

The docs you linked are quite clear

A physical line is a sequence of characters terminated by an end-of-line sequence. In source files, any of the standard platform line termination sequences can be used - the Unix form using ASCII LF (linefeed), the Windows form using the ASCII sequence CR LF (return followed by linefeed), or the old Macintosh form using the ASCII CR (return) character. All of these forms can be used equally, regardless of platform.

So this should be done on any platform, so we don't need build tags.

Also, theoretically, we should be looking at CR terminated lines. However that is a bigger change to the lexer, so let's not do that until someone requests it as I don't think macs have used CR terminated files for a very long time.

I'll merge this now - thank you :-)