lcompilers / lpython

Python compiler
https://lpython.org/

Add `str.isspace()` method #2586

Closed kmr-srbh closed 3 months ago

kmr-srbh commented 4 months ago

Fixes #2490.

#2490 gave wrong output because the escape sequence '\f' was handled incorrectly. Escaping the sequence was handled, but unescaping was missed. As a result, the string " \t\n\v\f\r" was transformed to " \t\n\v\\f\r" in the AST. Note the extra "\" between "\v" and "f". The answer in #2490 was therefore not actually wrong for the string the compiler saw, because non-whitespace and lowercase characters did exist in it.
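To make the difference concrete, here is a small sketch run under CPython, not LPython. The names `intended` and `mangled` are illustrative only; `mangled` spells out the string the AST held according to the description above.

```python
# What the user wrote: six characters, all of them whitespace.
intended = " \t\n\v\f\r"
# What the AST held per the description above: "\f" turned into a
# literal backslash followed by the letter "f" (seven characters).
mangled = " \t\n\v\\f\r"

for label, s in (("intended", intended), ("mangled", mangled)):
    print(label, len(s), s.isspace(), s.islower(), s.isupper())

# intended 6 True False False
# mangled 7 False True False
```

The backslash makes `isspace()` false, and the lone cased character "f" makes `islower()` true, which is consistent with the output reported in #2490.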

Fix

def f():
    b: str = " \t\n\v\f\r"
    print(b.isspace())
    print(b.islower())
    print(b.isupper())

f()
(base) saurabh-kumar@Awadh:~/Projects/System/lpython$ ./src/bin/lpython ./examples/example.py
True
False
False

Improvement

Though the error in the above case was due to incorrect string handling, the isspace() method itself was not implemented correctly. The definition of a 'whitespace' character is very broad, and the Python interpreter checks for all such characters in its implementation of the method. I have added checks for those characters.
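For reference, the CPython documentation defines a whitespace character as one whose Unicode general category is Zs or whose bidirectional class is WS, B, or S. The sketch below restates that rule using the standard `unicodedata` module. It is not the code in this PR, and the helper names `is_whitespace_char` and `isspace_reference` are hypothetical.

```python
import unicodedata


def is_whitespace_char(ch: str) -> bool:
    """Return True if CPython's documented rule treats ch as whitespace."""
    return (
        unicodedata.category(ch) == "Zs"
        or unicodedata.bidirectional(ch) in ("WS", "B", "S")
    )


def isspace_reference(s: str) -> bool:
    """Mirror str.isspace(): non-empty and every character is whitespace."""
    return len(s) > 0 and all(is_whitespace_char(ch) for ch in s)


# Collect every whitespace code point in the Basic Multilingual Plane.
whitespace_code_points = [
    cp for cp in range(0x10000) if is_whitespace_char(chr(cp))
]

print(isspace_reference(" \t\n\v\f\r"))  # True
print(isspace_reference(""))             # False
print(isspace_reference("\u2003"))       # True  (EM SPACE, category Zs)
print(isspace_reference("\u200b"))       # False (ZERO WIDTH SPACE, category Cf)
```

A list such as `whitespace_code_points` could serve as a cross-check for a hard-coded table in a compiler runtime, assuming the runtime targets the same Unicode version as the Python used to generate it.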

kmr-srbh commented 4 months ago

I am adding the tests in a while. @Thirumalai-Shaktivel please see this.

Thirumalai-Shaktivel commented 4 months ago

Also, let's add some tests.

Shaikh-Ubaid commented 4 months ago

@kmr-srbh Please mark it as ready for review when ready.

kmr-srbh commented 4 months ago

The failing test, though unrelated, was causing disruption here. It was the one that checked for invalid literals like 01. This was fixed in a recently merged PR, so I have removed that test.

@Shaikh-Ubaid This PR is now ready. As far as the Unicode characters are concerned, they cannot be typed directly, but they can appear when parsing strings or doing related work. They were added to conform to what Python treats as a whitespace character.

kmr-srbh commented 4 months ago

Problem persists. Why does ./run_tests.py -u not fix it?

kmr-srbh commented 3 months ago

@Thirumalai-Shaktivel could you please look into why the tests fail? The previous messages pointed to the null char I had declared for looping through the whitespace characters. I cannot understand why the tests fail now. Please guide me.