Closed andychu closed 5 years ago
@jyn514 Here's a simpler repro. (not sure why the last one was flaky)
A newline in the prompt causes this problem. So I think we should check -1
and then return the fallback... or maybe somehow indicate a warning? newlines in prompts are probably not uncommon.
Maybe we have to additionally split lines? And only calculate width of last line? So there could be 2 changes here -- check -1, and account for newlines in another way.
andy@lisa:~/git/oilshell/oil$ bin/osh --rcfile /dev/null
osh$ PS1=$'\n'
ls
.git/ .gitignore .mypy_cache/ .swp .travis.yml .vagrant/ INSTALL.txt LICENSE.txt Makefile NOTES-cpython.txt
NOTES.txt Python-2.7.13/ README.md Rplots.pdf TODO-completion.txt TODO.txt Vagrantfile __init__.py __pycache__/ _bin/
_build/ _chroot/ _deps/ _devbuild/ _release/ _tmp/ array.patch asdl.txt asdl/ benchmarks/
bin/ build/ configure core/ demo/ devtools/ dir/ doc/ edit.txt fastlex.pyi
fastlex.so file frontend/ gold/ install libc.pyi libc.so line_input.so local.sh logs-0.6.pre13.tar
metrics/ misc/ mycpp/ native/ oil-version.txt oil_lang/ opy/ osh/ ovm2/ pgen2/
portable-rules.mk posix_.pyi posix_.so pylib/ r.txt release-notes.txt release-roadmap.txt spec/ test/ testdata/
tmp.txt tools/ types/ uftrace.data.old/ uftrace.data/ vendor/ web/
Traceback (most recent call last):
File "/home/andy/git/oilshell/oil/core/comp_ui.py", line 89, in PrintCandidates
self._PrintCandidates(*args)
File "/home/andy/git/oilshell/oil/core/comp_ui.py", line 374, in _PrintCandidates
self._ReturnToPrompt(num_lines+1)
File "/home/andy/git/oilshell/oil/core/comp_ui.py", line 309, in _ReturnToPrompt
assert last_prompt_len != -1
AssertionError
I'm not sure that the autocomplete UI would work even if we only counted the last line, that's something to look into. -1 should probably be handled the same way as other errors, by just counting the number of bytes. We should also print a warning if the prompt is invalid or the UTF8 locale isn't installed instead of silently failing.
From https://github.com/oilshell/oil/pull/368#issuecomment-508510117:
I can't replicate the error for
PS1=$'\xe2\x86\x92'
on my machine. Here's my locale:LANG=en_US.UTF-8 LANGUAGE= LC_CTYPE="en_US.UTF-8" LC_NUMERIC="en_US.UTF-8" LC_TIME="en_US.UTF-8" LC_COLLATE="en_US.UTF-8" LC_MONETARY="en_US.UTF-8" LC_MESSAGES="en_US.UTF-8" LC_PAPER="en_US.UTF-8" LC_NAME="en_US.UTF-8" LC_ADDRESS="en_US.UTF-8" LC_TELEPHONE="en_US.UTF-8" LC_MEASUREMENT="en_US.UTF-8" LC_IDENTIFICATION="en_US.UTF-8" LC_ALL=
I noticed you have LANGUAGE=en_US, could you try changing that and seeing what happens? Currently we only change LC_CTYPE because that's what the linux man pages document, maybe it's different on MacOS.
The newline threw an assertion error for me as well.
Yeah it's possible we may need to take into account the prompt height in the future. But for now I think fixing the AssertionError is OK.
I would like to print a message rather than silently failing, but I'm not sure where to do it.
I still get the same error:
[osh] lisa ~/git/oilshell/oil$ export LANGUAGE=
[osh] lisa ~/git/oilshell/oil$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
[osh] lisa ~/git/oilshell/oil$ PS1=$'\n'
ls
.git/ .gitignore .mypy_cache/ .swp .travis.yml .vagrant/ INSTALL.txt LICENSE.txt Makefile NOTES-cpython.txt
NOTES.txt Python-2.7.13/ README.md Rplots.pdf TODO-completion.txt TODO.txt Vagrantfile __init__.py __pycache__/ _bin/
_build/ _chroot/ _deps/ _devbuild/ _release/ _tmp/ array.patch asdl.txt asdl/ benchmarks/
bin/ build/ configure core/ demo/ devtools/ dir/ doc/ edit.txt fastlex.pyi
fastlex.so file frontend/ gold/ install libc.pyi libc.so line_input.so local.sh logs-0.6.pre13.tar
metrics/ misc/ mycpp/ native/ oil-version.txt oil_lang/ opy/ osh/ ovm2/ pgen2/
portable-rules.mk posix_.pyi posix_.so pylib/ r.txt release-notes.txt release-roadmap.txt spec/ test/ testdata/
tmp.txt tools/ types/ uftrace.data.old/ uftrace.data/ vendor/ web/
Traceback (most recent call last):
File "/home/andy/git/oilshell/oil/core/comp_ui.py", line 89, in PrintCandidates
self._PrintCandidates(*args)
File "/home/andy/git/oilshell/oil/core/comp_ui.py", line 374, in _PrintCandidates
self._ReturnToPrompt(num_lines+1)
File "/home/andy/git/oilshell/oil/core/comp_ui.py", line 309, in _ReturnToPrompt
assert last_prompt_len != -1
AssertionError
I meant for the unicode arrow, not for the newline.
Does it make a difference? either way we have to handle -1.
I must have made some kind of error minimizing the case. The full prompt in the stackoverflow case failed. But it looks like the cause was the \n and not the arrow.
How would you expect this to behave for a newline inside an escape sequence? For example PS1=$'abc\x01\033[0;31m\n\x02> '
, should it be len('abc> ')
or len('> ')
? Or is it malformed and not worth testing?
I'm not sure... I try to come up with a strategy to test what bash does when I don't know.
But I think you can just do a mechanical change to fall back on len() if -1
is returned.
That is, follow the existing logic and remove everything inside \x01
\x02
, and then calculate the width.
"monotonically increasing correctness" is OK... i.e. I would fix this bug first and then worry about multiline prompts, which may require larger changes to other parts of the code.
The important part is really testing. So when we refactor later, we don't break this.
In this case I would add a unit tests for the PromptLen function. It can return -1 now, but it never should, because that breaks an invariant in the rest of the program (caught by the assertion).
Prior to the unicode change, the function could never return -1.
Do you use property-based testing? It would be nice to have something like https://hypothesis.readthedocs.io/en/latest/quickstart.html that asserts that the function never returns -1 instead of coming up with test cases manually.
Something like this:
diff --git a/core/comp_ui_test.py b/core/comp_ui_test.py
index 7650f7ce..aa95a981 100755
--- a/core/comp_ui_test.py
+++ b/core/comp_ui_test.py
@@ -7,6 +7,8 @@ from __future__ import print_function
import cStringIO
import sys
import unittest
+import hypothesis
+from hypothesis.strategies import text
from core import comp_ui # module under test
from core import util
@@ -170,9 +172,9 @@ class PromptTest(unittest.TestCase):
self.assertEqual(comp_ui._PromptLen("abc\ndef"), 3)
self.assertEqual(comp_ui._PromptLen(""), 0)
- def testControlCharacters(self):
- self.assertEqual(comp_ui._PromptLen("\xef"), 1)
- self.assertEqual(comp_ui._PromptLen("\x03\x05"), 2)
+ @hypothesis.given(text())
+ def testNeverPanics(self, s):
+ assert comp_ui._PromptLen(s) >= 0, repr(s)
if __name__ == '__main__':
unittest.main()
diff --git a/vendor/typing.py b/vendor/typing.py
index b0fdb1c2..a9a700bf 100644
--- a/vendor/typing.py
+++ b/vendor/typing.py
@@ -10,6 +10,9 @@
# if TYPE_CHECKING:
# NullFunc = Callable[[int, int], int]
+TypingMeta = None
+TypeVar = None
+_ForwardRef = None
List = None
Tuple = None
Optional = None
It's something I've been interested in, but haven't gotten around to trying.
Does it actually fail with the old code in this case? Like it manages to find the \n
? I guess it can hard-code those special strings and try it.
If so it might be nice to start trying it. We would have to sort ouf the PIP / Travis dependencies, which shouldn't be hard.
(Also if it takes a long time to run, I would put them in a different file so we can selectively run them. I like that the unit tests run pretty fast, and have no deps)
It runs pretty fast, but yes it introduces a dependency. It caught the -1, I think it came up with \x1f
or something like that.
OK great, I would accept a PR to add it. Although I would still be inclined to put things in a different file to start with. When we gain confidence with it, we can move them into the normal test files.
Maybe one per dir, like osh/property_tests.py
and core/property_tests.py
? The _tests.py
suffix shouldn't get picked up by test/unit.sh
, which is probably what we want at first.
Running it in Travis would be nice too but that could be a followup change.
The one caveat I have is that internal invariants are subject to change / refactoring -- hence my preference for spec tests over unit tests. But yes in this case there was a specific runtime invariant that the assert
was protecting.
This just caught another error! PyArg_ParseTuple throws a TypeError if a string contains a null byte.
OK interesting. Yeah I have been wondering about this... I think this will happen in all the other C extension too.
bash silently truncates.
$ echo $'foo\x00bar'
foo
mksh seems not to though...
Python builtins raise an error:
>>> os.listdir('bin\0')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: listdir() argument 1 must be encoded string without null bytes, not str
Not sure yet what we should do... great find though!!!
Hmm python seems to handle it fine as long as it doesn't go through a C extension:
(osh) $ echo $'foo\x00bar'
foobar
(osh) $ PS1=$'foo > \x00bar > '
foo >
Released with 0.7.pre1: https://www.oilshell.org/release/0.7.pre1/
I just merged this, but when I tested it against some random unicode prompt, libc.wcswidth() is returning -1? That causes an AssertionError later.
I reduced it to this, which works in bash:
Can you reproduce this on your machine? Either way, we should handle the -1 return value.
But I'm not sure why this is happening. Maybe the
'\x01
stripping isn't right? That would be an unprintable character.https://unix.stackexchange.com/questions/25903/awesome-symbols-and-characters-in-a-bash-prompt
Originally posted by @andychu in https://github.com/oilshell/oil/pull/368#issuecomment-508335695