dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0
11.53k stars 1.45k forks

UnicodeDecodeError in -w mode, versions 1.6.4 to 1.7.13 #22998

Open hanzch opened 3 months ago

hanzch commented 3 months ago

Dagster version

1.6.4 ~ 1.7.13

What's the issue?

Dear developers, this is an encoding bug that occurs in -w mode.


RESOURCE_INIT_SUCCESS - Finished initialization of resources [io_manager].
Traceback (most recent call last):
  File "D:\Programming\miniconda3\envs\streamline\lib\site-packages\dagster\_core\execution\poll_compute_logs.py", line 60, in <module>
    execute_polling(sys.argv[1:])
  File "D:\Programming\miniconda3\envs\streamline\lib\site-packages\dagster\_core\execution\poll_compute_logs.py", line 55, in execute_polling
    tail_polling(filepath, sys.stdout, parent_pid)
  File "D:\Programming\miniconda3\envs\streamline\lib\site-packages\dagster\_core\execution\poll_compute_logs.py", line 31, in tail_polling
    for block in iter(lambda: file.read(1024), None):
  File "D:\Programming\miniconda3\envs\streamline\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd6 in position 509: invalid continuation byte
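
The byte 0xd6 in the error above is consistent with the log text having been written in a legacy Chinese code page such as GBK and then read back as UTF-8. That is an assumption about this environment, not something the traceback proves. A minimal sketch of the mismatch, using only the Python standard library:

```python
# Sketch: the same string encoded as GBK cannot be decoded as UTF-8.
# Assumption: the compute log was written with a GBK-family code page.
text = "中文测试"

gbk_bytes = text.encode("gbk")
first_byte = gbk_bytes[0]  # "中" is 0xD6 0xD0 in GBK
print(hex(first_byte))

decode_error = None
try:
    gbk_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc)

# The round trip succeeds when writer and reader agree on the encoding.
utf8_round_trip = text.encode("utf-8").decode("utf-8")
print(utf8_round_trip == text)
```

The failure reason matches the traceback: 0xd6 looks like the first byte of a two-byte UTF-8 sequence, but the byte that follows it is not a valid continuation byte.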

What did you expect to happen?

No response

How to reproduce?

test_code:

f1.py

import dagster
logger = dagster.get_dagster_logger()

@dagster.asset(description='f1', group_name='t_group')
def factory_f1():
    logger.debug("中文测试")
    logger.debug("111111111")

f1_every_day_job = dagster.define_asset_job("f1_every_day_job",
                                            selection=dagster.AssetSelection.groups("t_group"))
f2_schedule = dagster.ScheduleDefinition(
    job=f1_every_day_job,
    cron_schedule="0 1 * * 2-6",
    execution_timezone="Asia/Shanghai",
    default_status=dagster.DefaultScheduleStatus.RUNNING,
)

if __name__ == '__main__':
    logger.debug("中文测试")

Deployment type

Local

Deployment details

workspace.yaml

load_from:
  - python_file: ./f2.py

start: dagster dev -w workspace.yaml

Additional information

Note that the same code runs without error when it is loaded with -f instead of -w.

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

garethbrickman commented 3 months ago

If you remove the Chinese characters in the code, does the error still occur? I suspect those characters are not UTF-8 encodable

hanzch commented 3 months ago

> If you remove the Chinese characters in the code, does the error still occur? I suspect those characters are not UTF-8 encodable

Thank you for your reply. As you said, the encoding problem is indeed caused by the Chinese characters. What confuses me more is that the same code runs normally in -f mode. Is that because -w mode and -f mode handle encoding differently?
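
Whatever the difference between the two modes turns out to be, a reader that assumes UTF-8 will fail on any log bytes written in another encoding. The sketch below shows one defensive way to read such a file without raising. It is not Dagster's code or its fix, and the helper name `read_log_tolerant` and the file name are hypothetical.

```python
import os
import tempfile


def read_log_tolerant(path, encoding="utf-8"):
    """Read a text file, substituting U+FFFD for undecodable bytes."""
    # errors="replace" avoids UnicodeDecodeError but loses the original
    # characters, so it is a diagnostic aid rather than a real fix.
    with open(path, "r", encoding=encoding, errors="replace") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "compute.log")  # hypothetical name
    # Simulate a writer that uses GBK while the reader expects UTF-8.
    with open(log_path, "wb") as f:
        f.write("中文测试\n111111111\n".encode("gbk"))

    tolerant_text = read_log_tolerant(log_path)
    correct_text = read_log_tolerant(log_path, encoding="gbk")

print("111111111" in tolerant_text)
print(correct_text.splitlines()[0])
```

Reading with the encoding the writer actually used recovers the text intact, which suggests the real remedy is making both sides agree on one encoding. On Windows, Python's UTF-8 mode (the `PYTHONUTF8=1` environment variable) is one way to do that for Python processes; whether it resolves this particular issue is untested here.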