Open kuixiang opened 3 months ago
Hi, nice catch!
I'd definitely accept a PR!
It looks like your solution works. If I understand correctly, the error happens because bytes are missing at decode time, e.g. we received only 3 of the 4 bytes of a character.
One small thing though: please raise a proper error (with a message; the error type should be the KubernetesJobOperator error or some ParseError, see the examples in errors).
Your code, with the corrections:
    prev = ""
    prev_binary_chunk = b""
    for chunk in response.stream(decode_content=False):
        if isinstance(chunk, bytes):
            chunk = prev_binary_chunk + chunk
            prev_binary_chunk = b""
            try:
                chunk = chunk.decode("utf8")
            except UnicodeDecodeError as e:
                # This may happen when a character that takes more than
                # one byte is split across chunks
                # (say, the char has 4 bytes, but we received 3).
                # This check needs an explanation as well (thanks!)
                if e.end != len(chunk):
                    raise KubeApiException(
                        "Error when parsing api response stream"
                    ) from e
                prev_binary_chunk = chunk[e.start :]
                chunk = chunk[0 : e.start].decode("utf8")
        chunk = prev + chunk
        lines = chunk.split("\n")
        if not chunk.endswith("\n"):
            prev = lines[-1]
            lines = lines[:-1]
        else:
            prev = ""
        for line in lines:
            if line:
                yield line
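For anyone who wants to try the idea outside the operator, here is a minimal standalone sketch of the same approach. The names `iter_lines` and `StreamParseError` are hypothetical stand-ins (the real code reads from `response.stream(...)` and raises `KubeApiException`), and emitting a final line that lacks a trailing newline is my assumption, not something the snippet above does. On the `e.end != len(...)` check: when a character is merely cut off by the chunk boundary, the decoder reports the failing range as running to the end of the data, so `e.end` equals the data length. If the range ends earlier, the bad bytes sit in the middle of the chunk and waiting for more data would not help.

```python
from typing import Iterable, Iterator


class StreamParseError(Exception):
    """Hypothetical stand-in for the operator's own error type."""


def iter_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield non-empty text lines from byte chunks that may split a character."""
    pending_text = ""    # trailing partial line, carried to the next chunk
    pending_bytes = b""  # trailing partial character, carried to the next chunk
    for chunk in chunks:
        data = pending_bytes + chunk
        pending_bytes = b""
        try:
            text = data.decode("utf-8")
        except UnicodeDecodeError as e:
            # A character cut off by the chunk boundary fails with
            # e.end == len(data); anything else is a bad byte mid-chunk.
            if e.end != len(data):
                raise StreamParseError(
                    "Error when parsing api response stream"
                ) from e
            pending_bytes = data[e.start:]
            text = data[:e.start].decode("utf-8")
        text = pending_text + text
        lines = text.split("\n")
        pending_text = lines.pop()  # "" when text ends with a newline
        for line in lines:
            if line:
                yield line
    # Assumption: a last line without a trailing newline is still emitted.
    if pending_text:
        yield pending_text


if __name__ == "__main__":
    # The chunk boundary falls inside the second character of "开始".
    print(list(iter_lines([b"\xe5\xbc\x80\xe5", b"\xa7\x8b\nok\n"])))
```

With the two chunks above, the first one yields nothing (the line is not finished yet) and the second one completes both the character and the line.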
Phenomenon
When a user submits a job through Airflow, the job runs for a while and then fails with the following error:
The job is then deleted, so the failure details cannot be inspected afterwards with commands like kubectl describe job or pod.
Cause of the Issue
The original code for retrieving the job context was:
The response is read in chunks and then reassembled. If the job context description contains Chinese characters, such as “开始” (start), they are encoded as \xe5\xbc\x80\xe5\xa7\x8b, but a chunk boundary may split them into \xe5\xbc\x80\xe5 and \xa7\x8b. The trailing \xe5 is the first byte of the second character (始), so decoding \xe5\xbc\x80\xe5 on its own as UTF-8 raises a UnicodeDecodeError. Airflow handles the resulting exception by abruptly deleting the job.
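The truncation can be reproduced in isolation. This is a small sketch using only the standard library; the split position (after 4 bytes) is chosen by hand to match the example above, and the variable names are only for illustration.

```python
data = "开始".encode("utf-8")         # b'\xe5\xbc\x80\xe5\xa7\x8b'
first, second = data[:4], data[4:]    # boundary falls inside the second character

try:
    first.decode("utf-8")
except UnicodeDecodeError as e:
    error = e  # keep a reference; `e` is cleared when the except block ends

print(error.reason)                    # unexpected end of data
print(error.start, error.end)          # 3 4
# Everything before error.start is valid and can be decoded right away.
print(first[:error.start].decode("utf-8"))
# The leftover byte decodes fine once the next chunk arrives.
print((first[error.start:] + second).decode("utf-8"))
```

Note that `error.end` equals `len(first)` here, which is how a character cut off at the end of a chunk can be told apart from invalid bytes in the middle of one.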
Solutions