Environment: Xubuntu 15.10, Sailfish from master (f111f6e4a0953357f0871374aa825bc2eaafc2a0), ATI R9 290.
When I launch the lid-driven cavity example (./ldc_2d.py), everything seems to work fine at first, but then the simulation suddenly hangs:
[ 1751 INFO Master/sobremesa] Machine master starting with PID 25192 at 2016-04-07 18:18:13 UTC
[ 1751 INFO Master/sobremesa] Simulation started with: ./ldc_2d.py
[ 1760 INFO Master/sobremesa] Sailfish version: f111f6e4a0953357f0871374aa825bc2eaafc2a0
[ 1761 INFO Master/sobremesa] Handling subdomains: [0]
[ 1761 INFO Master/sobremesa] Subdomain -> GPU map: {0: 0}
[ 1764 INFO Master/sobremesa] Selected backend: opencl
[ 2291 INFO Subdomain/0] Initializing subdomain.
[ 2291 INFO Subdomain/0] Required memory:
[ 2291 INFO Subdomain/0] . distributions: 5 MiB
[ 2291 INFO Subdomain/0] . fields: 0 MiB
[ 2422 INFO Subdomain/0] On-GPU invalid result check disabled as the device does not support all required features.
/home/pepe/Downloads/sailfish/sailfish/backend_opencl.py:159: UserWarning: Received OpenCL source code in Unicode, should be ASCII string. Attempting conversion.
return cl.Program(self.ctx, preamble + source).build() #'-cl-single-precision-constant -cl-fast-relaxed-math')
[ 5056 WARNING Subdomain/0] Running infinite simulation.
[ 5056 INFO Subdomain/0] Starting simulation.
[ 5510 INFO Subdomain/0] iteration:2000 speed:277.77 MLUPS
[ 5727 INFO Subdomain/0] iteration:3000 speed:295.56 MLUPS
[ 5951 INFO Subdomain/0] iteration:4000 speed:288.61 MLUPS
[ 6175 INFO Subdomain/0] iteration:5000 speed:288.83 MLUPS
[ 6441 INFO Subdomain/0] iteration:6000 speed:243.48 MLUPS
[ 6753 INFO Subdomain/0] iteration:7000 speed:208.11 MLUPS
[ 7033 INFO Subdomain/0] iteration:8000 speed:230.93 MLUPS
[ 7318 INFO Subdomain/0] iteration:9000 speed:227.47 MLUPS
[ 7574 INFO Subdomain/0] iteration:10000 speed:252.54 MLUPS
[ 7808 INFO Subdomain/0] iteration:11000 speed:276.91 MLUPS
[ 8067 INFO Subdomain/0] iteration:12000 speed:250.54 MLUPS
[ 8304 INFO Subdomain/0] iteration:13000 speed:273.10 MLUPS
[ 8595 INFO Subdomain/0] iteration:14000 speed:222.76 MLUPS
[ 8858 INFO Subdomain/0] iteration:15000 speed:246.14 MLUPS
[ 9052 INFO Subdomain/0] iteration:16000 speed:333.59 MLUPS
[ 9260 INFO Subdomain/0] iteration:17000 speed:311.17 MLUPS
[ 9503 INFO Subdomain/0] iteration:18000 speed:266.69 MLUPS
[ 9774 INFO Subdomain/0] iteration:19000 speed:238.98 MLUPS
[ 10013 INFO Subdomain/0] iteration:20000 speed:271.23 MLUPS
[ 10268 INFO Subdomain/0] iteration:21000 speed:253.38 MLUPS
[ 10535 INFO Subdomain/0] iteration:22000 speed:243.09 MLUPS
[ 10782 INFO Subdomain/0] iteration:23000 speed:262.50 MLUPS
[ 11032 INFO Subdomain/0] iteration:24000 speed:258.22 MLUPS
[ 11283 INFO Subdomain/0] iteration:25000 speed:258.77 MLUPS
[ 11527 INFO Subdomain/0] iteration:26000 speed:265.50 MLUPS
[ 11791 INFO Subdomain/0] iteration:27000 speed:245.31 MLUPS
[ 12058 INFO Subdomain/0] iteration:28000 speed:242.33 MLUPS
[ 12311 INFO Subdomain/0] iteration:29000 speed:255.68 MLUPS
[ 12564 INFO Subdomain/0] iteration:30000 speed:256.76 MLUPS
[ 12818 INFO Subdomain/0] iteration:31000 speed:254.30 MLUPS
[ 13066 INFO Subdomain/0] iteration:32000 speed:261.79 MLUPS
[ 13491 INFO Subdomain/0] iteration:33000 speed:152.45 MLUPS
[ 13741 INFO Subdomain/0] iteration:34000 speed:259.01 MLUPS
[ 14018 INFO Subdomain/0] iteration:35000 speed:233.74 MLUPS
[ 14260 INFO Subdomain/0] iteration:36000 speed:267.39 MLUPS
[ 14510 INFO Subdomain/0] iteration:37000 speed:258.93 MLUPS
If I cancel the job, the traceback suggests a synchronization problem between processes: the master is blocked in join(), waiting for the simulation process to exit:
File "./ldc_2d.py", line 41, in <module>
ctrl.run()
File "/home/pepe/Downloads/sailfish/sailfish/controller.py", line 793, in run
return self._finish_simulation(subdomain_specs, summary_receiver)
File "/home/pepe/Downloads/sailfish/sailfish/controller.py", line 708, in _finish_simulation
self._simulation_process.join()
File "/usr/lib/python2.7/multiprocessing/process.py", line 145, in join
res = self._popen.wait(timeout)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 154, in wait
return self.poll(0)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 135, in poll
pid, sts = os.waitpid(self.pid, flag)
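The traceback above ends in an unbounded `join()`, so from the master's side a hung child and a slow child look the same. A minimal sketch of how one might tell them apart, assuming only the standard `multiprocessing` API: `stuck_worker` and `join_with_timeout` are hypothetical names for illustration and are not part of Sailfish.

```python
import multiprocessing
import time


def stuck_worker():
    # Hypothetical stand-in for a simulation process that never returns.
    while True:
        time.sleep(0.1)


def join_with_timeout(proc, timeout):
    """Wait up to `timeout` seconds; return True if the process has exited."""
    proc.join(timeout)
    return not proc.is_alive()


if __name__ == "__main__":
    proc = multiprocessing.Process(target=stuck_worker)
    proc.start()
    if not join_with_timeout(proc, 1.0):
        # The child is still alive, so the parent would block forever in join().
        print("child still running after 1 s; terminating")
        proc.terminate()
        proc.join()
```

This only confirms that the child process is the one that is stuck; it does not say why. In this report the hang also occurs with --debug_single_process, which points at the simulation loop or the OpenCL backend rather than at inter-process communication.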
If I instead launch the case in single-process mode:
./ldc_2d.py --debug_single_process
it hangs again:
[ 1718 INFO MainProcess] Machine master starting with PID 25261 at 2016-04-07 18:21:15 UTC
[ 1718 INFO MainProcess] Simulation started with: ./ldc_2d.py --debug_single_process
[ 1728 INFO MainProcess] Sailfish version: f111f6e4a0953357f0871374aa825bc2eaafc2a0
[ 1729 INFO MainProcess] Handling subdomains: [0]
[ 1729 INFO MainProcess] Subdomain -> GPU map: {0: 0}
[ 1730 INFO MainProcess] Selected backend: opencl
[ 2273 INFO MainProcess] Initializing subdomain.
[ 2273 INFO MainProcess] Required memory:
[ 2273 INFO MainProcess] . distributions: 5 MiB
[ 2273 INFO MainProcess] . fields: 0 MiB
[ 2448 INFO MainProcess] On-GPU invalid result check disabled as the device does not support all required features.
/home/pepe/Downloads/sailfish/sailfish/backend_opencl.py:159: UserWarning: Received OpenCL source code in Unicode, should be ASCII string. Attempting conversion.
return cl.Program(self.ctx, preamble + source).build() #'-cl-single-precision-constant -cl-fast-relaxed-math')
[ 5546 WARNING MainProcess] Running infinite simulation.
[ 5564 INFO MainProcess] Starting simulation.
[ 6078 INFO MainProcess] iteration:2000 speed:266.26 MLUPS
[ 6288 INFO MainProcess] iteration:3000 speed:304.68 MLUPS
[ 6513 INFO MainProcess] iteration:4000 speed:287.41 MLUPS
[ 6740 INFO MainProcess] iteration:5000 speed:285.69 MLUPS
[ 6966 INFO MainProcess] iteration:6000 speed:286.89 MLUPS
[ 7199 INFO MainProcess] iteration:7000 speed:278.13 MLUPS
[ 7452 INFO MainProcess] iteration:8000 speed:255.82 MLUPS
[ 7703 INFO MainProcess] iteration:9000 speed:257.62 MLUPS
[ 7921 INFO MainProcess] iteration:10000 speed:297.96 MLUPS
[ 8164 INFO MainProcess] iteration:11000 speed:266.58 MLUPS
[ 8382 INFO MainProcess] iteration:12000 speed:296.28 MLUPS
[ 8632 INFO MainProcess] iteration:13000 speed:259.16 MLUPS
[ 8895 INFO MainProcess] iteration:14000 speed:246.05 MLUPS
[ 9125 INFO MainProcess] iteration:15000 speed:282.82 MLUPS
[ 9355 INFO MainProcess] iteration:16000 speed:281.31 MLUPS
[ 9590 INFO MainProcess] iteration:17000 speed:275.48 MLUPS
[ 9839 INFO MainProcess] iteration:18000 speed:260.35 MLUPS
[ 10076 INFO MainProcess] iteration:19000 speed:272.75 MLUPS
[ 10351 INFO MainProcess] iteration:20000 speed:235.59 MLUPS
[ 10625 INFO MainProcess] iteration:21000 speed:236.49 MLUPS
[ 11062 INFO MainProcess] iteration:22000 speed:148.00 MLUPS
[ 11284 INFO MainProcess] iteration:23000 speed:292.25 MLUPS
[ 11503 INFO MainProcess] iteration:24000 speed:295.61 MLUPS
[ 11764 INFO MainProcess] iteration:25000 speed:248.77 MLUPS
[ 12020 INFO MainProcess] iteration:26000 speed:252.55 MLUPS
[ 12274 INFO MainProcess] iteration:27000 speed:254.87 MLUPS
[ 12531 INFO MainProcess] iteration:28000 speed:252.07 MLUPS
[ 12779 INFO MainProcess] iteration:29000 speed:261.26 MLUPS
And this time I cannot cancel the job :-S