dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Disable YAML parsing into date objects at load time #2654

Open b0nj0m0n opened 4 years ago

b0nj0m0n commented 4 years ago

The pipeline executes when the .py file is run directly. As soon as I upgraded to 0.8.4, dagit would no longer start. I tried the same dagit -f my_pipeline.py, a workspace.yaml, and a repo.py; all threw the same error. It seems to be trying to JSON-serialize a date.

Reverted to 0.7.14 and dagit started fine. Running on Windows/Anaconda with dagit 0.8.4, dagster 0.8.4, dagster-pandas 0.8.4.

Traceback (most recent call last):
  File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  ...
  File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\api\snapshot_repository.py", line 28, in sync_get_external_repositories
    ExternalRepositoryData,
  File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\check\__init__.py", line 121, in inst
    raise_with_traceback(_type_mismatch_error(obj, ttype, desc))
  File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\future\utils\__init__.py", line 446, in raise_with_traceback
    raise exc.with_traceback(traceback)
dagster.check.CheckError: Object IPCErrorMessage(serializable_error_info=SerializableErrorInfo(message='TypeError: Object of type date is not JSON serializable\n', stack=[
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\ipc.py", line 116, in ipc_write_stream\n yield FileBasedWriteStream(file_path)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\ipc.py", line 35, in ipc_write_unary_response\n stream.send(obj)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\ipc.py", line 91, in send\n _send(self._file_path, dagster_named_tuple)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\ipc.py", line 99, in _send\n fp.write(serialize_dagster_namedtuple(obj) + \'\n\')\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\__init__.py", line 209, in serialize_dagster_namedtuple\n **json_kwargs\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\__init__.py", line 187, in _serialize_dagster_namedtuple\n return seven.json.dumps(_pack_value(nt, enum_map, tuple_map), **json_kwargs)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\json\__init__.py", line 238, in dumps\n **kw).encode(obj)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\json\encoder.py", line 199, in encode\n chunks = self.iterencode(o, _one_shot=True)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\json\encoder.py", line 257, in iterencode\n return _iterencode(o, 0)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\json\encoder.py", line 179, in default\n raise TypeError(f\'Object of type {o.__class__.__name__} \'\n'
], cls_name='TypeError', cause=None), message=None) is not a ExternalRepositoryData. Got IPCErrorMessage(serializable_error_info=SerializableErrorInfo(message='TypeError: Object of type date is not JSON serializable\n', stack=[
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\ipc.py", line 116, in ipc_write_stream\n yield FileBasedWriteStream(file_path)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\ipc.py", line 35, in ipc_write_unary_response\n stream.send(obj)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\ipc.py", line 91, in send\n _send(self._file_path, dagster_named_tuple)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\ipc.py", line 99, in _send\n fp.write(serialize_dagster_namedtuple(obj) + \'\n\')\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\__init__.py", line 209, in serialize_dagster_namedtuple\n **json_kwargs\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\site-packages\dagster\serdes\__init__.py", line 187, in _serialize_dagster_namedtuple\n return seven.json.dumps(_pack_value(nt, enum_map, tuple_map), **json_kwargs)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\json\__init__.py", line 238, in dumps\n **kw).encode(obj)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\json\encoder.py", line 199, in encode\n chunks = self.iterencode(o, _one_shot=True)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\json\encoder.py", line 257, in iterencode\n return _iterencode(o, 0)\n',
  ' File "c:\users\bsmith\appdata\local\continuum\anaconda3\lib\json\encoder.py", line 179, in default\n raise TypeError(f\'Object of type {o.__class__.__name__} \'\n'
], cls_name='TypeError', cause=None), message=None) with type <class 'dagster.serdes.ipc.IPCErrorMessage'>.

alangenfeld commented 4 years ago

Thanks for the report. Do you know where the date object is coming from? I am guessing it is a piece of metadata on a pipeline in your repository.

b0nj0m0n commented 4 years ago

This pipeline dynamically loads a resource from an external YAML file with multiple dates. Does this mean that dates can't be used in configuration files?
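
For readers who want to see the failure in isolation, here is a minimal sketch that uses only the Python standard library. It is not taken from the pipeline in this report: the config dictionary, the `start_date` key, and the `try_serialize` helper are hypothetical, and the date value stands in for what a YAML parser typically produces from an unquoted value such as 2020-06-01.

```python
import datetime
import json

# Hypothetical config, shaped like what a YAML parser returns when it
# converts an unquoted 2020-06-01 into a datetime.date object.
config_with_date = {"start_date": datetime.date(2020, 6, 1)}


def try_serialize(config):
    """Return the JSON text, or the TypeError message if serialization fails."""
    try:
        return json.dumps(config)
    except TypeError as exc:
        return f"TypeError: {exc}"


# The date object cannot be encoded by the standard JSON encoder.
failure = try_serialize(config_with_date)

# The same config with the date kept as an ISO-format string serializes fine.
config_with_strings = {
    key: value.isoformat() if isinstance(value, datetime.date) else value
    for key, value in config_with_date.items()
}
success = try_serialize(config_with_strings)

print(failure)
print(success)
```

The first call reports the same kind of TypeError that appears in the traceback above, while the second succeeds because the date is carried as a string.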

alangenfeld commented 4 years ago

No, we just need to stop letting the YAML parser turn them into date objects at load time.

Good catch!
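
One way to keep a YAML parser from producing date objects can be sketched as follows. This assumes PyYAML is installed and is only an illustration, not the change that was made in dagster; `NoDatesSafeLoader` and the sample document are hypothetical names. The idea is to subclass `yaml.SafeLoader` and give the subclass its own implicit-resolver table with the timestamp entry removed, so date-like scalars stay plain strings.

```python
import json

import yaml  # PyYAML, assumed to be installed

TIMESTAMP_TAG = "tag:yaml.org,2002:timestamp"


class NoDatesSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader variant that leaves date-like scalars as strings."""


# Give the subclass its own resolver table, copied from SafeLoader but
# without the timestamp resolver, so SafeLoader itself is left untouched.
NoDatesSafeLoader.yaml_implicit_resolvers = {
    first_char: [(tag, regexp) for tag, regexp in resolvers if tag != TIMESTAMP_TAG]
    for first_char, resolvers in yaml.SafeLoader.yaml_implicit_resolvers.items()
}

document = "start_date: 2020-06-01\nrows: 10\n"

# Default behavior: the unquoted date becomes a datetime.date object.
default_result = yaml.safe_load(document)

# With the custom loader the same value stays a string and is JSON-safe.
string_result = yaml.load(document, Loader=NoDatesSafeLoader)
serialized = json.dumps(string_result)

print(type(default_result["start_date"]).__name__)
print(serialized)
```

On the user side, quoting the value in the YAML file (for example "2020-06-01") should have the same effect without a custom loader.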