DragonHPC / dragon

Dragon distributed runtime for HPC and AI applications and workflows
http://dragonhpc.org
MIT License

JSON serialization problem #9

Closed by andre-merzky 5 months ago

andre-merzky commented 9 months ago

Dear Dragoneers,

I am seeing the following error on several of the example scripts included in the repo:

$ dragon queue_demo.py
Traceback (most recent call last):
  File "/home/merzky/radical/radical.pilot.devel/dragon/examples/multiprocessing/queue_demo.py", line 101, in <module>
    p.start()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/home/merzky/radical/radical.pilot.devel/dragon/src/dragon/mpbridge/process.py", line 114, in _Popen
    return DragonPopen(process_obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.11/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/merzky/radical/radical.pilot.devel/dragon/src/dragon/mpbridge/process.py", line 69, in _launch
    process_obj.proc_desc = create_with_argdata(
                            ^^^^^^^^^^^^^^^^^^^^
  File "/home/merzky/radical/radical.pilot.devel/dragon/src/dragon/globalservices/process.py", line 426, in create_with_argdata
    the_desc = create(
               ^^^^^^^
  File "/home/merzky/radical/radical.pilot.devel/dragon/src/dragon/globalservices/process.py", line 348, in create
    reply_msg = das.gs_request(req_msg)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/merzky/radical/radical.pilot.devel/dragon/src/dragon/globalservices/api_setup.py", line 193, in gs_request
    req_msg_bytes = req_msg.serialize()
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/merzky/radical/radical.pilot.devel/dragon/src/dragon/infrastructure/messages.py", line 303, in serialize
    return json.dumps(self.get_sdict())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
+++ head proc exited, code 1

I installed this commit:

$ pip freeze | grep dragon
-e git+ssh://git@github.com/DragonHPC/dragon.git@8f17cadebf2f03af6a383c4c593be5de9f3d835b#egg=dragon&subdirectory=src

but had to relax some of the version constraints:

$ git diff -P ../../.devcontainer/constraints.txt 
diff --git i/.devcontainer/constraints.txt w/.devcontainer/constraints.txt
index e23a787..b85b6d0 100644
--- i/.devcontainer/constraints.txt
+++ w/.devcontainer/constraints.txt
@@ -1,5 +1,4 @@
 alabaster==0.7.12
-attrs==22.1.0
 Babel==2.11.0
 black==22.10.0
 breathe==4.34.0
@@ -31,7 +30,6 @@ pluggy==1.0.0
 pycparser==2.21
 Pygments==2.13.0
 pyparsing==3.0.9
-pytest==7.2.0
 pytz==2022.6
 PyYAML==6.0
 requests==2.28.1

as pip would otherwise not be able to resolve all dependencies.

Do you have any advice on how I could debug the JSON problem? Thanks!

colinpwahl commented 9 months ago

Thanks for investigating and bringing this to our attention! An immediate fix would be to use Python 3.9 or 3.10. The issue is caused by a change made in Python 3.11 to allow for path-like executable objects. As a result, when Dragon processes were being started, a bytes object was passed where Dragon was expecting a string. A bytes object is not JSON serializable, hence the error that is raised. We recently noticed this in our testing and have a fix that will be pushed in the next few days.
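
For anyone who wants to see the failure mode in isolation, here is a minimal sketch using only the standard library. It is not the Dragon fix; the helper name `to_json_safe` and the payload field names are hypothetical and chosen only for illustration. It shows that `json.dumps` rejects a bytes value and that decoding path-like values with `os.fsdecode` first avoids the error:

```python
import json
import os


def to_json_safe(value):
    """Recursively decode bytes and path-like values to str (hypothetical helper)."""
    if isinstance(value, (bytes, os.PathLike)):
        # os.fsdecode accepts str, bytes, or os.PathLike and returns str.
        return os.fsdecode(value)
    if isinstance(value, dict):
        return {key: to_json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(item) for item in value]
    return value


# Hypothetical message payload; these field names are not Dragon's.
payload = {"exe": b"/usr/bin/python3", "args": ["queue_demo.py"]}

try:
    json.dumps(payload)
    error_text = None
except TypeError as err:
    # Same class of error as in the traceback above.
    error_text = str(err)

# After decoding, the payload serializes without error.
encoded = json.dumps(to_json_safe(payload))
```

Running the same kind of check over the dictionary passed to `json.dumps` is also one way to find which field carries the bytes value.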

andre-merzky commented 9 months ago

Thanks @colinpwahl for the fast response! I'll make the recommended switch and wait for the fix.

andre-merzky commented 8 months ago

Just want to report that downgrading to 3.10 worked.

kentdlee commented 5 months ago

Great. Sounds like this is all worked out. We recently updated to version 0.9, so if you'd like, you can give that a try.