sot / kadi

Chandra commands and events
https://sot.github.io/kadi
BSD 3-Clause "New" or "Revised" License

occweb test failure #112

Closed: taldcroft closed this issue 9 months ago

taldcroft commented 6 years ago

On current master I am getting a failure like this on both ska and ska3. Note that if I run only the one test that fails here, it succeeds. I can't remember whether we've seen this before. Does this ring a bell, @jeanconn? Maybe ping Brandon?

ska3-kadi$ python setup.py test --args='-k occweb'
/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/testr/__init__.py
running test
running egg_info
writing kadi.egg-info/PKG-INFO
writing dependency_links to kadi.egg-info/dependency_links.txt
writing entry points to kadi.egg-info/entry_points.txt
writing top-level names to kadi.egg-info/top_level.txt
reading manifest file 'kadi.egg-info/SOURCES.txt'
writing manifest file 'kadi.egg-info/SOURCES.txt'
running build_ext
====================================================== test session starts =======================================================
platform linux -- Python 3.6.2, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /data/baffin/tom/git/kadi, inifile:
collected 48 items                                                                                                                

kadi/tests/test_occweb.py .F...

============================================================ FAILURES ============================================================
_____________________________________________________ test_put_get_user_none _____________________________________________________

self = <paramiko.Transport at 0x103d1f28 (unconnected)>

    def _check_banner(self):
        # this is slow, but we only have to do it once
        for i in range(100):
            # give them 15 seconds for the first line, then just 2 seconds
            # each additional line.  (some sites have very high latency.)
            if i == 0:
                timeout = self.banner_timeout
            else:
                timeout = 2
            try:
>               buf = self.packetizer.readline(timeout)

/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/transport.py:1893: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <paramiko.packet.Packetizer object at 0x7ff9103d1c50>, timeout = 15

    def readline(self, timeout):
        """
            Read a line from the socket.  We assume no data is pending after the
            line, so it's okay to attempt large reads.
            """
        buf = self.__remainder
        while not linefeed_byte in buf:
>           buf += self._read_timeout(timeout)

/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/packet.py:331: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <paramiko.packet.Packetizer object at 0x7ff9103d1c50>, timeout = 15

    def _read_timeout(self, timeout):
        start = time.time()
        while True:
            try:
                x = self.__socket.recv(128)
                if len(x) == 0:
                    raise EOFError()
                break
            except socket.timeout:
                pass
            except EnvironmentError as e:
                if (type(e.args) is tuple and len(e.args) > 0 and
                        e.args[0] == errno.EINTR):
                    pass
                else:
                    raise
            if self.__closed:
                raise EOFError()
            now = time.time()
            if now - start >= timeout:
>               raise socket.timeout()
E               socket.timeout

/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/packet.py:501: timeout

During handling of the above exception, another exception occurred:

    @pytest.mark.skipif('not HAS_LUCKY')
    def test_put_get_user_none():
        # Test the user=None code branch (gets username back from SFTP object, which
        # had previously gotten it from the netrc file).
>       _test_put_get(user=None)

kadi/tests/test_occweb.py:75: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
kadi/tests/test_occweb.py:48: in _test_put_get
    occweb.ftp_get_from_lucky(remote_tmpdir, local_filenames, user=user)
kadi/occweb.py:143: in ftp_get_from_lucky
    ftp = Ska.ftp.SFTP('lucky', logger=logger, user=user)
/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/Ska.ftp-3.5-py3.6.egg/Ska/ftp/ftp.py:72: in __init__
    transport.connect(username=user, password=passwd)
/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/transport.py:1086: in connect
    self.start_client()
/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/transport.py:500: in start_client
    raise e
/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/transport.py:1749: in run
    self._check_banner()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <paramiko.Transport at 0x103d1f28 (unconnected)>

    def _check_banner(self):
        # this is slow, but we only have to do it once
        for i in range(100):
            # give them 15 seconds for the first line, then just 2 seconds
            # each additional line.  (some sites have very high latency.)
            if i == 0:
                timeout = self.banner_timeout
            else:
                timeout = 2
            try:
                buf = self.packetizer.readline(timeout)
            except ProxyCommandFailure:
                raise
            except Exception as e:
>               raise SSHException('Error reading SSH protocol banner' + str(e))
E               paramiko.ssh_exception.SSHException: Error reading SSH protocol banner

/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/transport.py:1897: SSHException
------------------------------------------------------ Captured stderr call ------------------------------------------------------
Exception: Error reading SSH protocol banner
Traceback (most recent call last):
  File "/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/transport.py", line 1893, in _check_banner
    buf = self.packetizer.readline(timeout)
  File "/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/packet.py", line 331, in readline
    buf += self._read_timeout(timeout)
  File "/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/packet.py", line 501, in _read_timeout
    raise socket.timeout()
socket.timeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/transport.py", line 1749, in run
    self._check_banner()
  File "/proj/sot/ska3/flight/arch/x86_64-linux_CentOS-6/lib/python3.6/site-packages/paramiko/transport.py", line 1897, in _check_banner
    raise SSHException('Error reading SSH protocol banner' + str(e))
paramiko.ssh_exception.SSHException: Error reading SSH protocol banner

======================================================== warnings summary ========================================================
None
  passing a string to pytest.main() is deprecated, pass a list of arguments instead.

-- Docs: http://doc.pytest.org/en/latest/warnings.html
====================================================== 43 tests deselected =======================================================
================================= 1 failed, 4 passed, 43 deselected, 1 warnings in 50.97 seconds =================================

jeanconn commented 6 years ago

Yes, I've seen this. I was trying to fix Ska.ftp, but kept running into this and didn't get back to it. I think the sftp server on lucky is doing some throttling. I was seeing the same thing: one test works fine, but the second and following tests fail. And yes, if they actually do have throttling set up, we should find out what the rules are (though I don't know whether building those rules into our public-code tests would be a security concern).
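
If throttling really is the cause (still only a guess), one possible workaround in the tests would be to retry the connection with an increasing delay. The following is a minimal sketch, not anything that exists in kadi or Ska.ftp: the helper name `call_with_backoff` and its parameters are hypothetical, and the real throttling rules on lucky are unknown.

```python
import time


def call_with_backoff(func, retries=3, initial_delay=1.0, factor=2.0,
                      exceptions=(Exception,), sleep=time.sleep):
    """Call func(), retrying on failure with an exponentially growing delay.

    Hypothetical helper for illustration only. `retries` is the number of
    extra attempts after the first one; `sleep` is injectable so the
    function can be tested without actually waiting.
    """
    delay = initial_delay
    for attempt in range(retries + 1):
        try:
            return func()
        except exceptions:
            if attempt == retries:
                # Out of attempts: let the last error propagate unchanged.
                raise
            sleep(delay)
            delay *= factor
```

In principle the `Ska.ftp.SFTP('lucky', logger=logger, user=user)` call seen in the traceback could be wrapped in a lambda and passed as `func`, with `exceptions` narrowed to the paramiko `SSHException`. That is untested against lucky, and the delays would need tuning once the actual limits are known.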

jeanconn commented 6 years ago

At one point I tried to get an update to paramiko because I thought it was an actual SSH protocol issue, thanks to that "Error reading SSH protocol banner" text, but really that just means it doesn't have a connection anymore.
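
The traceback above is consistent with that reading: `_check_banner` catches the `socket.timeout` and re-raises it as `SSHException('Error reading SSH protocol banner' + str(e))`, and since a bare `socket.timeout()` has an empty string form, the final message carries no hint that a timeout was the cause. A small sketch that reproduces just this wrapping, using a stand-in `SSHException` class and hypothetical helper names rather than paramiko itself:

```python
import socket


class SSHException(Exception):
    """Stand-in for paramiko.ssh_exception.SSHException (illustration only)."""


def check_banner(readline):
    # Mirrors the except clause shown in the traceback above.
    try:
        return readline()
    except Exception as e:
        raise SSHException('Error reading SSH protocol banner' + str(e))


def timed_out_read():
    # Simulates the banner read giving up after its timeout.
    raise socket.timeout()


try:
    check_banner(timed_out_read)
except SSHException as err:
    message = str(err)       # the text that ends up in the test log
    cause = err.__context__  # the original exception, kept by Python 3

print(repr(message))
print(type(cause).__name__)
```

The underlying timeout is still reachable through `__context__`, which is why pytest prints both tracebacks joined by "During handling of the above exception, another exception occurred".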

taldcroft commented 9 months ago

This hasn't come up recently AFAIK.