geek-ai / MAgent

A Platform for Many-Agent Reinforcement Learning
MIT License

EOFError on Ubuntu 20.04 #89

Open dameng123 opened 2 years ago

dameng123 commented 2 years ago

I'm sorry to disturb you, but I really need your help. When I run "python examples/train_battle.py --train", the following error occurs. The full output is:

(marl) dzl112@dzl112:~/dameng/python/MAgent$ python examples/train_battle.py --train
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:522: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
From /home/dzl112/dameng/python/MAgent/python/magent/builtin/tf_model/dqn.py:185: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2021-11-16 18:29:04.489464: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
From /home/dzl112/dameng/python/MAgent/python/magent/builtin/tf_model/dqn.py:185: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2021-11-16 18:29:04.942377: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Namespace(alg='dqn', eval=False, greedy=False, load_from=None, map_size=125, n_round=2000, name='battle', render=False, render_every=10, save_every=5, train=True)
view_space (13, 13, 7)
feature_space (34,)
===== sample =====
eps 1.00 number [625, 625]
step 0, nums: [625, 625] reward: [-26.53 -26.23], total_reward: [-26.53 -26.23]
step 50, nums: [625, 625] reward: [-26.63 -26.83], total_reward: [-1350.08 -1369.58]
step 100, nums: [624, 625] reward: [-27.42 -26.03], total_reward: [-2679.7 -2689.73]
step 150, nums: [624, 624] reward: [-27.22 -25.12], total_reward: [-4001.7 -4032.14]
step 200, nums: [619, 624] reward: [-25.1 -26.22], total_reward: [-5314.39 -5333.84]
step 250, nums: [616, 621] reward: [-25.78 -26.91], total_reward: [-6597.48 -6632.02]
step 300, nums: [613, 617] reward: [-24.37 -25.69], total_reward: [-7868.09 -7935.38]
step 350, nums: [612, 616] reward: [-26.86 -27.78], total_reward: [-9170.89 -9239.29]
step 400, nums: [610, 616] reward: [-25.95 -26.28], total_reward: [-10452.42 -10512.19]
step 450, nums: [609, 614] reward: [-26.35 -26.17], total_reward: [-11720.11 -11790.2 ]
step 500, nums: [605, 610] reward: [-25.03 -24.95], total_reward: [-12987.11 -13057.87]
step 550, nums: [602, 608] reward: [-24.61 -26.94], total_reward: [-14243.59 -14325.24]
steps: 551, total time: 36.23, step average 0.07
===== train =====
batch number: 6663 add: 341185 replay_len: 341185/1048576
batch number: 6625 add: 339220 replay_len: 339220/1048576
batch 0, loss 0.066977, eval 0.008397
batch 0, loss 0.298991, eval -0.234968
batch 1000, loss 0.024740, eval 0.163353
batch 1000, loss 0.009467, eval 0.127751
batch 2000, loss 0.007746, eval 0.121407
batch 2000, loss 0.004247, eval 0.178776
batch 3000, loss 0.001601, eval 0.168760
batch 3000, loss 0.001414, eval 0.217879
batch 4000, loss 0.000912, eval 0.184864
batch 4000, loss 0.001028, eval 0.248757
batch 5000, loss 0.000803, eval 0.196566
batch 5000, loss 0.000792, eval 0.247229
batch 6000, loss 0.000605, eval 0.193597
batch 6000, loss 0.000946, eval 0.254682
batches: 6625, total time: 648.04, 1k average: 97.82
batches: 6663, total time: 650.94, 1k average: 97.69
train_time 652.10
round 0 loss: [0.01, 0.01] num: [602, 608] reward: [-14243.59, -14325.24] value: [0.21, 0.3]
round time 688.33 total time 688.33

===== sample =====
eps 1.00 number [625, 625]
step 0, nums: [625, 625] reward: [-25.63 -26.43], total_reward: [-25.63 -26.43]
step 50, nums: [625, 625] reward: [-27.93 -27.03], total_reward: [-1362.88 -1357.58]
step 100, nums: [625, 625] reward: [-24.13 -26.33], total_reward: [-2701.53 -2703.03]
step 150, nums: [624, 625] reward: [-28.02 -25.33], total_reward: [-4026.69 -4034.38]
step 200, nums: [624, 624] reward: [-25.62 -24.32], total_reward: [-5343.19 -5351.61]
step 250, nums: [620, 624] reward: [-24.8 -24.52], total_reward: [-6660.58 -6645.71]
step 300, nums: [616, 623] reward: [-27.38 -25.52], total_reward: [-7943.51 -7941.5 ]
step 350, nums: [613, 622] reward: [-24.27 -27.21], total_reward: [-9234.7 -9234.86]
step 400, nums: [606, 619] reward: [-23.73 -25.1 ], total_reward: [-10486.65 -10484.31]
step 450, nums: [605, 618] reward: [-26.33 -26.19], total_reward: [-11764.21 -11796.6 ]
step 500, nums: [601, 617] reward: [-24.91 -25.49], total_reward: [-13033.18 -13079.04]
step 550, nums: [600, 614] reward: [-26.4 -25.87], total_reward: [-14280.83 -14365.36]
steps: 551, total time: 29.94, step average 0.05
===== train =====
batch number: 6623 add: 339111 replay_len: 678331/1048576
batch 0, loss 0.000958, eval 0.165303
batch number: 6691 add: 342622 replay_len: 683807/1048576
batch 0, loss 0.001485, eval 0.283710
batch 1000, loss 0.000656, eval 0.202931
batch 1000, loss 0.001262, eval 0.292073
batch 2000, loss 0.000821, eval 0.212229
batch 2000, loss 0.001034, eval 0.331045
batch 3000, loss 0.000863, eval 0.247228
batch 3000, loss 0.002247, eval 0.345279
batch 4000, loss 0.000870, eval 0.242257
batch 4000, loss 0.002150, eval 0.378697
batch 5000, loss 0.002529, eval 0.290189
batch 5000, loss 0.002757, eval 0.409260
batch 6000, loss 0.003087, eval 0.417159
batch 6000, loss 0.002246, eval 0.272336
batches: 6623, total time: 593.40, 1k average: 89.60
batches: 6691, total time: 597.11, 1k average: 89.24
train_time 599.24
round 1 loss: [0.0, 0.0] num: [600, 614] reward: [-14280.83, -14365.36] value: [0.29, 0.46]
round time 629.18 total time 1317.51

===== sample =====
eps 1.00 number [625, 625]
step 0, nums: [625, 625] reward: [-27.43 -26.23], total_reward: [-27.43 -26.23]
step 50, nums: [625, 625] reward: [-27.13 -25.93], total_reward: [-1362.18 -1367.68]
step 100, nums: [625, 625] reward: [-24.63 -25.53], total_reward: [-2680.03 -2691.03]
step 150, nums: [622, 623] reward: [-25.91 -26.82], total_reward: [-3977.77 -3987.96]
step 200, nums: [618, 621] reward: [-24.29 -26.71], total_reward: [-5281.69 -5281.74]
step 250, nums: [615, 618] reward: [-26.07 -21.89], total_reward: [-6559.37 -6563.23]
step 300, nums: [612, 612] reward: [-24.76 -24.56], total_reward: [-7824.65 -7863.79]
step 350, nums: [612, 612] reward: [-26.26 -27.66], total_reward: [-9131.35 -9150.69]
step 400, nums: [610, 609] reward: [-25.25 -26.85], total_reward: [-10392.15 -10409.55]
step 450, nums: [610, 607] reward: [-28.85 -22.94], total_reward: [-11669.25 -11684.32]
step 500, nums: [610, 605] reward: [-24.95 -24.23], total_reward: [-12936.55 -12956.71]
step 550, nums: [609, 604] reward: [-22.95 -24.92], total_reward: [-14211.54 -14222.43]
steps: 551, total time: 31.51, step average 0.06
===== train =====
batch number: 6630 add: 339483 replay_len: 1017814/1048576
batch 0, loss 0.004495, eval 0.320383
batch 1000, loss 0.004926, eval 0.337636
batch 2000, loss 0.004059, eval 0.395189
batch 3000, loss 0.007036, eval 0.424905
batch 4000, loss 0.007425, eval 0.469079
batch 5000, loss 0.005349, eval 0.484610
batch 6000, loss 0.006193, eval 0.494254
batches: 6630, total time: 355.08, 1k average: 53.56
Traceback (most recent call last):
  File "examples/train_battle.py", line 221, in <module>
    eps=eps)  # for e-greedy
  File "examples/train_battle.py", line 124, in play_a_round
    total_loss[i], value[i] = models[i].fetch_train()
  File "/home/dzl112/dameng/python/MAgent/python/magent/model.py", line 238, in fetch_train
    return self.conn.recv()
  File "/home/dzl112/anaconda3/envs/marl/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/dzl112/anaconda3/envs/marl/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/dzl112/anaconda3/envs/marl/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

Kipsora commented 2 years ago

While I have seen other people hit this issue before, I am afraid we cannot provide any concrete help on it, because this project is not actively maintained and we are short of hands now. From the traceback you posted, though, it looks like the process was reading from an empty pipe whose other end had closed (refer here).
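For context, here is a minimal standalone sketch (not MAgent code) of how recv() ends up raising exactly this EOFError when the peer process exits without writing anything:

import multiprocessing as mp


def worker(conn):
    # Exit without sending anything, simulating a crashed worker.
    conn.close()


if __name__ == "__main__":
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=worker, args=(child_conn,))
    proc.start()
    child_conn.close()  # drop the parent's copy so the EOF becomes visible
    proc.join()
    try:
        parent_conn.recv()  # nothing buffered and the write end is gone
    except EOFError:
        print("recv() raised EOFError: pipe is empty and the peer closed it")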

A tentative workaround is to call wait() before recv(). You could try applying the following patch:

diff --git a/python/magent/model.py b/python/magent/model.py
index a60e793..2f7de98 100644
--- a/python/magent/model.py
+++ b/python/magent/model.py
@@ -208,6 +208,7 @@ class ProcessingModel(BaseModel):
         -------
         actions: numpy array (int32)
         """
+        multiprocessing.connection.wait(self.conn)
         info = self.conn.recv()
         return NDArrayPackage(info).recv_from(self.conn)[0]

@@ -235,6 +236,7 @@ class ProcessingModel(BaseModel):
         value: float
             mean state value
         """
+        multiprocessing.connection.wait(self.conn)
         return self.conn.recv()

     def save(self, save_dir, epoch, block=True):

But I have to add the disclaimer that this may not be the right solution. You can try it and post here if any problem remains, but please do not expect a prompt response since, again, this project is not actively maintained.

dameng123 commented 2 years ago

Thank you very much for your reply. I tried the method above, but it failed with TypeError: 'Connection' object is not iterable, as follows:

(marl) dzl112@dzl112:~/dameng/python/MAgent$ python examples/train_battle.py --train
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:522: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/dzl112/anaconda3/envs/marl/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
From /home/dzl112/dameng/python/MAgent/python/magent/builtin/tf_model/dqn.py:185: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2021-11-17 18:57:59.162981: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
From /home/dzl112/dameng/python/MAgent/python/magent/builtin/tf_model/dqn.py:185: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
2021-11-17 18:57:59.604186: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Namespace(alg='dqn', eval=False, greedy=False, load_from=None, map_size=125, n_round=2000, name='battle', render=False, render_every=10, save_every=5, train=True)
view_space (13, 13, 7)
feature_space (34,)
===== sample =====
eps 1.00 number [625, 625]
Traceback (most recent call last):
  File "examples/train_battle.py", line 221, in <module>
    eps=eps)  # for e-greedy
  File "examples/train_battle.py", line 70, in play_a_round
    acts[i] = models[i].fetch_action()  # fetch actions (blocking)
  File "/home/dzl112/dameng/python/MAgent/python/magent/model.py", line 211, in fetch_action
    multiprocessing.connection.wait(self.conn)
  File "/home/dzl112/anaconda3/envs/marl/lib/python3.6/multiprocessing/connection.py", line 904, in wait
    for obj in object_list:
TypeError: 'Connection' object is not iterable


Kipsora commented 2 years ago

Could you try wrapping self.conn in a list, like multiprocessing.connection.wait([self.conn])?
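For reference, a minimal standalone sketch (not MAgent code) of the list-wrapped call: wait() takes a list of connections, hence the TypeError above. Note that wait() also returns connections whose peer has already closed, so the recv() that follows can still raise EOFError and should be guarded:

import multiprocessing as mp
from multiprocessing.connection import wait


def worker(conn):
    conn.send("hello")
    conn.close()


if __name__ == "__main__":
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=worker, args=(child_conn,))
    proc.start()
    child_conn.close()  # drop the parent's copy so the EOF becomes visible

    while True:
        wait([parent_conn])  # blocks until readable *or* the peer has closed
        try:
            print(parent_conn.recv())  # "hello" on the first pass
        except EOFError:  # wait() returned, but the pipe was closed, not readable
            print("peer closed the connection")
            break
    proc.join()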

dameng123 commented 2 years ago

Thank you very much for your reply. I tried this method again, but the EOFError still appeared when the program reached round 880. When I reduced the replay buffer size, cut the number of agents to 100, and set the number of rounds to 1000, everything ran normally. Earlier, when I ran "train_tiger.py", everything was also normal.
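Given that shrinking the replay buffer and the agent count makes the error disappear, one plausible but unconfirmed explanation is that the training worker process is killed by the Linux OOM killer once memory fills up; the parent only ever sees this as an EOFError on the pipe. Below is a minimal standalone sketch (not MAgent code; the kill is simulated with SIGKILL) of how such a death surfaces and how to check the worker's exit code:

import multiprocessing as mp
import os
import signal
import time


def worker(conn):
    time.sleep(60)  # never gets around to sending a result
    conn.send("done")


if __name__ == "__main__":
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=worker, args=(child_conn,))
    proc.start()
    child_conn.close()
    time.sleep(1)  # let the worker start, then simulate the OOM killer
    os.kill(proc.pid, signal.SIGKILL)
    try:
        parent_conn.recv()
    except EOFError:
        proc.join()
        # A negative exitcode is the signal number; -9 means SIGKILL.
        print(f"worker died before replying, exitcode={proc.exitcode}")

If recv() fails like this and the worker's exitcode is -9, the process was SIGKILLed, which on Ubuntu usually points to the OOM killer. The "replay_len: .../1048576" lines in the log suggest a replay buffer of 2**20 entries per side, so reducing it, as you did, is consistent with this explanation; checking dmesg for oom-kill messages after a failed round would confirm it.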