Open LuQQiu opened 3 years ago
I tested this on a local Mac. The issue can be reproduced with:
pip3 install alluxio
pip3 install -r /alluxio-py/requirements.txt
python3 test.py
import json
import sys
import time
import alluxio
from alluxio import option

client = alluxio.Client('localhost', 39999)
with client.open('/alluxio-2.7.0-SNAPSHOT-client.jar', 'r') as f:
    a = f.read(100)  # read a few bytes to start the stream
    for i in range(40):  # stay idle for about 40 seconds
        time.sleep(1)
        print(f'{i} ', end='\r')
    a = f.read(1024 * 1024 * 1024 * 2)  # request up to 2 GiB; this call fails
    print(f"finish, {len(a)}")
The file is only 27 MB.
When the counter reaches 39, the following error occurs:
Traceback (most recent call last):
  File "/Users/alluxio/alluxioFolder/alluxio-py/test.py", line 13, in <module>
    a = f.read(1024 * 1024 * 1024 * 2)
  File "/Users/alluxio/alluxioFolder/alluxio-py/alluxio/client.py", line 620, in read
    return self.response.raw.read(num)
  File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 461, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/usr/local/Cellar/python@3.9/3.9.0_2/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.9/site-packages/urllib3/response.py", line 380, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(524188 bytes read)', IncompleteRead(524188 bytes read))
I added logs to StreamsRestServiceHandler.read() and close().
In our code, I see that open returns an alluxio.client.file.FileInStream. After the error appears, the log shows that close() is called to invalidate the file in stream cache.
I suspect the timeout comes from some code in alluxio-py rather than from the proxy. There is no timeout logic in the proxy code.
See https://github.com/tweepy/tweepy/issues/908. alluxio-py has some dependencies that may raise this error. The timeout was not triggered on the Alluxio proxy side or by alluxio-py itself.
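A possible client-side workaround, until the root cause is found, is to read in chunks and reopen the stream when the connection breaks. The following is only a minimal sketch under assumptions, not part of alluxio-py: `read_with_reopen`, `open_stream`, and `FlakyStream` are hypothetical names, and `open_stream` is assumed to return a fresh file-like object with `read()` and `close()` positioned at offset 0. With alluxio-py one would presumably pass `retry_on=(urllib3.exceptions.ProtocolError,)`, since that is the exception in the traceback above. The demo uses an in-memory stand-in so it runs without a proxy.

```python
import io


def _skip(stream, offset, chunk_size):
    """Discard `offset` bytes from a freshly opened stream."""
    remaining = offset
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("stream ended before the resume offset")
        remaining -= len(chunk)


def read_with_reopen(open_stream, total_size, chunk_size=1 << 20,
                     retry_on=(OSError,), max_retries=3):
    """Read up to total_size bytes, reopening the stream on a broken connection.

    open_stream: hypothetical zero-argument callable returning a new
    file-like object (read/close) that starts at offset 0.
    """
    data = bytearray()
    failures = 0
    while len(data) < total_size:
        stream = open_stream()
        try:
            # After a reopen, skip the bytes that were already received.
            _skip(stream, len(data), chunk_size)
            while len(data) < total_size:
                chunk = stream.read(min(chunk_size, total_size - len(data)))
                if not chunk:
                    return bytes(data)  # EOF reached before total_size
                data.extend(chunk)
        except retry_on:
            failures += 1
            if failures > max_retries:
                raise
        finally:
            stream.close()
    return bytes(data)


class FlakyStream(io.BytesIO):
    """Stand-in for a proxy stream: the first instance fails on its 2nd read."""
    opened = 0

    def __init__(self, payload):
        super().__init__(payload)
        FlakyStream.opened += 1
        self._fail = FlakyStream.opened == 1
        self._reads = 0

    def read(self, size=-1):
        self._reads += 1
        if self._fail and self._reads == 2:
            raise ConnectionError("Connection broken: IncompleteRead")
        return super().read(size)


payload = bytes(range(256)) * 40
result = read_with_reopen(lambda: FlakyStream(payload), len(payload),
                          chunk_size=1000)
```

Skipping by re-reading is wasteful for large offsets. It is used here only because the sketch assumes nothing about seek support in the underlying stream.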
Alluxio Version:
Describe the bug: When reading a large file with the Alluxio Python SDK, the Python streaming function (a TensorFlow application) reads some bytes, pauses for a while, reads some more bytes, and pauses again. If the application pauses for about 40 seconds, the next read fails. It looks like the fuse read connection is broken. No issue occurs when reading small files.