mingkaid / rl-prompt

Accompanying repo for the RLPrompt paper
MIT License
286 stars, 52 forks

BrokenPipeError: [Errno 32] Broken pipe #37

Closed: Xinhui-Zhu closed this issue 8 months ago

Xinhui-Zhu commented 9 months ago

I encountered:

Traceback (most recent call last):
  File "/content/drive/MyDrive/rl-prompt/examples/few-shot-classification/run_fsc.py", line 49, in main
    trainer.train(config=config)
  File "/content/drive/MyDrive/rl-prompt/rlprompt/trainers/trainer.py", line 160, in train
    wandb.log(batch_log)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 419, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 370, in wrapper_fn
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 360, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 1792, in log
    self._log(data=data, step=step, commit=commit)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 1567, in _log
    self._partial_history_callback(data, step, commit)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/wandb_run.py", line 1439, in _partial_history_callback
    self._backend.interface.publish_partial_history(
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/interface/interface.py", line 546, in publish_partial_history
    self._publish_partial_history(partial_history)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/interface/interface_shared.py", line 89, in _publish_partial_history
    self._publish(rec)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/interface/interface_sock.py", line 51, in _publish
    self._sock_client.send_record_publish(record)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/lib/sock_client.py", line 221, in send_record_publish
    self.send_server_request(server_req)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/lib/sock_client.py", line 155, in send_server_request
    self._send_message(msg)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/lib/sock_client.py", line 152, in _send_message
    self._sendall_with_error_handle(header + data)
  File "/usr/local/lib/python3.10/dist-packages/wandb/sdk/lib/sock_client.py", line 130, in _sendall_with_error_handle
    sent = self._sock.send(data)
BrokenPipeError: [Errno 32] Broken pipe
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

when running

python run_fsc.py \
    dataset=sst-2 \
    dataset_seed=0 \
    prompt_length=5

as described at https://github.com/mingkaid/rl-prompt/tree/main/examples/few-shot-classification

From what I found online, this error is usually attributed to num_workers being too large, and the suggested fix is to set num_workers=0 during training. However, I can't find num_workers anywhere in rl-prompt's code. Can you help me?

MM-IR commented 8 months ago

Not sure what you mean... There should be no errors if you follow our instructions.

mingkaid commented 8 months ago

@Xinhui-Zhu Are you running this in a distributed setup? wandb is known to raise errors when multiple processes try to call it simultaneously.
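
If several processes are in fact calling wandb at once, one possible workaround is to log from a single process only. The following is a minimal sketch, not code from rl-prompt: the helper names `is_main_process` and `log_on_main` are hypothetical, and it assumes the launcher exports a `RANK` environment variable (as torchrun does). If `RANK` is unset, the run is treated as single-process.

```python
import os


def is_main_process() -> bool:
    """Return True on rank 0, or when no RANK variable is set.

    Assumption: the launcher exports RANK (torchrun does); an unset RANK
    is treated as a single-process run.
    """
    return int(os.environ.get("RANK", "0")) == 0


def log_on_main(log_fn, payload: dict) -> bool:
    """Call log_fn(payload) on the main process only (hypothetical helper).

    Returns True if the payload was logged, and False if it was skipped
    on a non-main rank or the logger's connection was already closed.
    """
    if not is_main_process():
        return False
    try:
        log_fn(payload)
    except BrokenPipeError:
        # The logging backend's socket is gone; skip this record so that
        # training can continue instead of crashing.
        return False
    return True
```

In the trainer, this would mean replacing `wandb.log(batch_log)` with `log_on_main(wandb.log, batch_log)`. Note that catching `BrokenPipeError` only hides the symptom: if the wandb background service has died, later records are silently dropped, so it may be worth printing a warning in that branch.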