Closed m-ou-se closed 5 years ago
Most of them are solved or have a workaround. The ones that are left aren't really solvable, since we can't turn off the Rust tokenizer. The remaining issues are now listed in the documentation: https://docs.rs/inline-python/0.2.0/inline_python/#syntax-issues
Literals
Strings
Escape sequences
Python and Rust string escape sequences are mostly the same. Differences:
\u1234 (Python) vs \u{1234} (Rust)
Missing in Rust: \U12345678, \123 (octal) (except for \0), \N{name}, \a, \b, \f, \v
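To make the differences above concrete, here is a minimal sketch in plain Rust (not code inside the macro) showing spellings the Rust tokenizer accepts for characters that Python writes with \a, \b, \f, \v, \u1234, \U0001F600 and an octal escape. The function names are made up for illustration. Note that this only shows what tokenizes in Rust: \xNN means the same thing to Python, but \u{1234} is not valid Python.

```rust
/// Python's "\a\b\f\v" written with escapes the Rust tokenizer accepts.
fn control_chars() -> &'static str {
    // \a = BEL (0x07), \b = BS (0x08), \f = FF (0x0C), \v = VT (0x0B)
    "\x07\x08\x0C\x0B"
}

/// Python's "\u1234" and "\U0001F600" both use the braced form in Rust.
fn unicode_chars() -> &'static str {
    "\u{1234}\u{1F600}"
}

/// Python's octal "\101" (the letter A) written as a hex escape.
fn octal_as_hex() -> &'static str {
    "\x41"
}

fn main() {
    println!("{:?}", control_chars());
    println!("{:?}", unicode_chars());
    println!("{:?}", octal_as_hex());
}
```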
Single quoted
Not going to work, since the Rust tokenizer reads '...' as a character literal, but shouldn't be a big problem: double quotes can be used instead.
Prefixes
u"..."
f"..."
Byte Strings
Rust has this as well. :tada:
b"..."
Raw strings
r"..."
Raw strings of this form are supported by Rust. The rules differ slightly, though: escaped quotes are possible in Python (r"...\"..."), but not in Rust.
fr"...", rf"...", br"...", rb"..."
These are problematic: they would be parsed as fr (identifier) followed by a regular string literal, so escape sequences that are invalid in a regular string are a problem. A possible workaround is a space before the r: f r"...".
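As a rough illustration of the raw-string difference, the sketch below is plain Rust with hypothetical function names. It shows that a raw string without quotes looks the same in both languages, while a raw string containing a quote needs Rust's # delimiters, a spelling Python does not understand. That is why r"...\"..." has no form both languages accept.

```rust
/// A raw string without quotes is written the same way in Python and Rust.
fn raw_plain() -> &'static str {
    r"\d+\.\d+"
}

/// Rust raw strings have no escapes, so a quote requires `#` delimiters.
/// The value matches Python's r"a\"b": the four characters a, \, ", b.
fn raw_with_quote() -> &'static str {
    r#"a\"b"#
}

fn main() {
    println!("{}", raw_plain());
    println!("{}", raw_with_quote());
}
```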
Long form
Long form strings accept multi-line contents in Python. Rust always accepts newlines in string literals. Minor difference: the handling of a \ at the end of the line. In Python, that only removes the newline. In Rust, that removes the newline and any indentation on the next line. Update: The Rust tokenizer leaves whitespace after \<newline> intact! :tada:
"""..."""
u"""...""" (etc.)
r"""...""", b"""..."""
These are a problem: this will be parsed as an empty raw string (r"") or byte string (b""), followed by two regular string literals ("..." and ""). That's fine if the string doesn't contain any raw/byte-string-specific things (escape sequences, or non-UTF-8 bytes).
Numbers
Numbers are exactly the same. Same prefixes, same digit separator. :tada: (Rust allows some suffixes like f64, but that doesn't matter, as long as the Python syntax is a subset of what the Rust tokenizer accepts.)
Imaginary literals
123j
Operators
The // and //= operators are a problem, as those start comments in Rust.
Comments
# starts a comment in Python, but the Rust tokenizer will still tokenize everything after it, rejecting Python comments that contain invalid Rust tokens. The solution is to use Rust comments (//).
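The // operator problem and the comment behaviour both come from the same fact: macro input is tokenized by Rust before the macro sees it. The sketch below uses the standard stringify! macro as a stand-in for a procedural macro, which is an assumption made only for illustration; the function names are hypothetical. It shows that everything after // is discarded, while the tokens after # are kept and therefore have to be valid Rust tokens.

```rust
/// `//` starts a Rust comment, so the macro only ever receives `7`.
fn floor_div_tokens() -> &'static str {
    stringify!(7 // 2
    )
}

/// `#` is an ordinary token for Rust, so the words after it are
/// tokenized and passed on to the macro.
fn hash_comment_tokens() -> &'static str {
    stringify!(x # note
    )
}

fn main() {
    println!("{}", floor_div_tokens());
    println!("{}", hash_comment_tokens());
}
```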