deepmodeling / dpdispatcher

Generate input scripts for HPC scheduler jobs, submit them to HPC systems, and poll until they finish
https://docs.deepmodeling.com/projects/dpdispatcher/
GNU Lesser General Public License v3.0

UnicodeDecodeError: 'utf-8' codec can't decode #427

Open Franklalalala opened 8 months ago

Franklalalala commented 8 months ago

When a job finishes, dpdispatcher fetches and checks the job's log file. This step raises a UnicodeDecodeError:

  File "/opt/mamba/lib/python3.10/site-packages/dpdispatcher/submission.py", line 260, in run_submission
    self.update_submission_state()
  File "/opt/mamba/lib/python3.10/site-packages/dpdispatcher/submission.py", line 345, in update_submission_state
    job.get_job_state()
  File "/opt/mamba/lib/python3.10/site-packages/dpdispatcher/submission.py", line 831, in get_job_state
    job_state = self.machine.check_status(self)
  File "/opt/mamba/lib/python3.10/site-packages/dpdispatcher/machines/dp_cloud_server.py", line 211, in check_status
    job_log = self.api.get_log(job_id)
  File "/opt/mamba/lib/python3.10/site-packages/dpdispatcher/utils/dpcloudserver/client.py", line 281, in get_log
    return resp.content.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte

The error seems random. For example, out of 10 jobs, 6 were downloaded successfully; the other 4 could not be downloaded automatically because of this error.
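The failure in the last traceback frame can be reproduced in isolation. The sketch below assumes a made-up payload: only the leading 0x96 byte is taken from the traceback, and the rest of the bytes are hypothetical. In UTF-8, 0x96 is a continuation byte, so it cannot start a character, which is why decoding fails at position 0.

```python
# Hypothetical payload: only the leading 0x96 byte comes from the traceback;
# the remaining bytes are invented for illustration.
payload = b"\x96 job log text"

try:
    payload.decode("utf-8")
    message = None
except UnicodeDecodeError as err:
    # err.start is the offset of the first undecodable byte.
    message = str(err)
    bad_offset = err.start

print(message)
```

This suggests the response body for the failing jobs is either not UTF-8 text at all or starts with a stray byte; it does not say which.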

Franklalalala commented 8 months ago

Commenting out lines 211, 212, 213, 217, and 218 in dp_cloud_server.py avoids the error. Is there a better way to fix this? Or which log file should I check instead?
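One possible local workaround, short of commenting out the log check, would be to decode the log leniently. This is only a sketch: `decode_log` is a hypothetical helper, not part of dpdispatcher, and it hides the symptom rather than explaining why the server returned non-UTF-8 bytes.

```python
def decode_log(raw: bytes) -> str:
    """Decode log bytes as UTF-8, replacing invalid bytes with U+FFFD.

    Hypothetical helper, not a dpdispatcher function.
    """
    return raw.decode("utf-8", errors="replace")


# The payload is invented; only the 0x96 byte is taken from the traceback.
text = decode_log(b"\x96 job finished")
print(repr(text))
```

With `errors="replace"`, each undecodable byte becomes the replacement character, so the rest of the log stays readable and the status check can proceed. Whether the resulting text is meaningful depends on what the server actually sent.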

njzjz commented 8 months ago

It seems that the upstream API gives an incorrect response. The code here should be correct.
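To help confirm whether the response body is malformed, the raw bytes could be summarized before decoding. The following is a minimal sketch under stated assumptions: `describe_payload` is a hypothetical helper, and the sample bytes are invented apart from the leading 0x96 seen in the traceback.

```python
def describe_payload(raw: bytes, preview: int = 16) -> dict:
    """Summarize raw response bytes for a bug report (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        valid_utf8 = True
        first_bad_offset = None
    except UnicodeDecodeError as err:
        valid_utf8 = False
        first_bad_offset = err.start
    return {
        "length": len(raw),
        # Hex dump of the first few bytes, space separated (Python 3.8+).
        "head_hex": raw[:preview].hex(" "),
        "valid_utf8": valid_utf8,
        "first_bad_offset": first_bad_offset,
    }


summary = describe_payload(b"\x96ab")
print(summary)
```

Attaching such a summary for a failing job and a succeeding job would show whether the failing responses are binary data, truncated text, or text in another encoding.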

njzjz commented 8 months ago

cc @xiaoyeqiannian