choderalab / fahmunge

Tools for Munging Folding@Home datasets
MIT License

Multiprocessing version leaving a lot of processes hanging #21

Closed: jchodera closed this issue 8 years ago

jchodera commented 8 years ago

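It looks like the multiprocessing version can leave a lot of processes hanging. After a couple of days of running, I counted 211 lines of ps output matching "python" (209 of them munge_fah_data_parallel.py processes, plus the grep itself and an unrelated osad daemon):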

-bash-4.1$ ps xauwww | grep python
server    3259  0.0  0.0 103308   876 pts/0    S+   18:40   0:00 grep python
server    4000  1.6  0.0 1071708 203424 pts/2  Sl+  Mar05  24:25 python scripts/munge_fah_data_parallel.py
server    4001  2.2  0.0 1094348 227188 pts/2  Sl+  Mar05  32:36 python scripts/munge_fah_data_parallel.py
server    4002  1.4  0.0 1074172 206636 pts/2  Sl+  Mar05  21:49 python scripts/munge_fah_data_parallel.py
server    4004  1.5  0.0 1072044 204540 pts/2  Sl+  Mar05  22:57 python scripts/munge_fah_data_parallel.py
server    4005  1.5  0.0 1097284 229792 pts/2  Sl+  Mar05  22:37 python scripts/munge_fah_data_parallel.py
server    4006  1.5  0.0 1073412 205988 pts/2  Sl+  Mar05  22:11 python scripts/munge_fah_data_parallel.py
server    4007  1.5  0.0 1095004 227732 pts/2  Sl+  Mar05  22:48 python scripts/munge_fah_data_parallel.py
server    4008  1.8  0.0 1072780 205260 pts/2  Sl+  Mar05  27:42 python scripts/munge_fah_data_parallel.py
server    4009  1.6  0.0 1041116 173764 pts/2  Sl+  Mar05  24:21 python scripts/munge_fah_data_parallel.py
server    4010  1.5  0.0 1045248 177592 pts/2  Sl+  Mar05  23:18 python scripts/munge_fah_data_parallel.py
server    4011  1.7  0.0 1070764 202980 pts/2  Sl+  Mar05  25:52 python scripts/munge_fah_data_parallel.py
server    4012  1.5  0.0 1092928 225484 pts/2  Sl+  Mar05  23:16 python scripts/munge_fah_data_parallel.py
server    4014  2.1  0.0 1128196 260920 pts/2  Sl+  Mar05  32:02 python scripts/munge_fah_data_parallel.py
server    4015  1.8  0.0 1072684 205312 pts/2  Sl+  Mar05  26:43 python scripts/munge_fah_data_parallel.py
server    4016  1.7  0.0 1071136 203656 pts/2  Sl+  Mar05  25:19 python scripts/munge_fah_data_parallel.py
server    4017  1.7  0.0 1072288 204720 pts/2  Sl+  Mar05  25:45 python scripts/munge_fah_data_parallel.py
root      4077  0.0  0.0 205956  9496 ?        S    Mar01   0:23 /usr/bin/python -s /usr/sbin/osad --pid-file /var/run/osad.pid
server    7678  1.7  0.0 1272640 208872 pts/2  Sl+  Mar05  25:34 python scripts/munge_fah_data_parallel.py
server    7679  1.7  0.0 1308740 244968 pts/2  Sl+  Mar05  25:52 python scripts/munge_fah_data_parallel.py
server    7680  1.8  0.0 1300404 236620 pts/2  Sl+  Mar05  26:06 python scripts/munge_fah_data_parallel.py
server    7681  1.7  0.1 1337600 274132 pts/2  Sl+  Mar05  25:49 python scripts/munge_fah_data_parallel.py
server    7682  1.7  0.0 1282828 218876 pts/2  Sl+  Mar05  25:52 python scripts/munge_fah_data_parallel.py
server    7683  1.7  0.1 1330816 267356 pts/2  Sl+  Mar05  25:52 python scripts/munge_fah_data_parallel.py
server    7684  1.7  0.0 1300216 236424 pts/2  Sl+  Mar05  25:38 python scripts/munge_fah_data_parallel.py
server    7685  1.7  0.0 1275088 211172 pts/2  Sl+  Mar05  25:27 python scripts/munge_fah_data_parallel.py
server    7686  1.7  0.0 1285176 221404 pts/2  Sl+  Mar05  25:48 python scripts/munge_fah_data_parallel.py
server    7687  1.7  0.0 1292628 228840 pts/2  Sl+  Mar05  25:45 python scripts/munge_fah_data_parallel.py
server    7688  1.7  0.0 1293104 229292 pts/2  Sl+  Mar05  25:25 python scripts/munge_fah_data_parallel.py
server    7689  1.7  0.0 1276316 212396 pts/2  Sl+  Mar05  25:37 python scripts/munge_fah_data_parallel.py
server    7690  1.7  0.0 1295628 232160 pts/2  Sl+  Mar05  25:46 python scripts/munge_fah_data_parallel.py
server    7691  1.7  0.0 1318016 254548 pts/2  Sl+  Mar05  25:49 python scripts/munge_fah_data_parallel.py
server    7692  2.0  0.0 1321140 257672 pts/2  Sl+  Mar05  28:50 python scripts/munge_fah_data_parallel.py
server    7693  1.7  0.0 1269232 205440 pts/2  Sl+  Mar05  25:26 python scripts/munge_fah_data_parallel.py
server    8063  1.5  0.0 2833624 197200 pts/2  Sl+  01:17  16:34 python scripts/munge_fah_data_parallel.py
server    8064  1.6  0.0 2833512 196980 pts/2  Sl+  01:17  16:51 python scripts/munge_fah_data_parallel.py
server    8065  1.6  0.0 2833368 196944 pts/2  Sl+  01:17  16:52 python scripts/munge_fah_data_parallel.py
server    8066  1.5  0.0 2833404 197056 pts/2  Sl+  01:17  16:22 python scripts/munge_fah_data_parallel.py
server    8067  1.5  0.0 2833736 197184 pts/2  Sl+  01:17  16:27 python scripts/munge_fah_data_parallel.py
server    8068  1.5  0.0 2833744 197192 pts/2  Sl+  01:17  16:31 python scripts/munge_fah_data_parallel.py
server    8069  1.4  0.0 2833092 196536 pts/2  Sl+  01:17  15:35 python scripts/munge_fah_data_parallel.py
server    8070  1.6  0.0 2833704 197152 pts/2  Sl+  01:17  16:46 python scripts/munge_fah_data_parallel.py
server    8071  1.3  0.0 2833736 197200 pts/2  Sl+  01:17  13:49 python scripts/munge_fah_data_parallel.py
server    8072  1.5  0.0 2833608 197052 pts/2  Sl+  01:17  16:38 python scripts/munge_fah_data_parallel.py
server    8073  1.5  0.0 2833700 197160 pts/2  Sl+  01:17  16:40 python scripts/munge_fah_data_parallel.py
server    8074  1.5  0.0 2840904 204528 pts/2  Sl+  01:17  16:38 python scripts/munge_fah_data_parallel.py
server    8075  1.5  0.0 2833588 197164 pts/2  Sl+  01:17  16:40 python scripts/munge_fah_data_parallel.py
server    8076  1.5  0.0 2833340 196920 pts/2  Sl+  01:17  16:37 python scripts/munge_fah_data_parallel.py
server    8077  1.5  0.0 2833724 197172 pts/2  Sl+  01:17  16:37 python scripts/munge_fah_data_parallel.py
server    8078  1.6  0.0 2833712 197156 pts/2  Sl+  01:17  16:49 python scripts/munge_fah_data_parallel.py
server   11172 98.4  0.0 3041824 209200 pts/2  Sl+  01:38 1006:42 python scripts/munge_fah_data_parallel.py
server   11173 98.5  0.1 3105724 273156 pts/2  Sl+  01:38 1007:30 python scripts/munge_fah_data_parallel.py
server   11174 98.9  0.0 3047860 215256 pts/2  Sl+  01:38 1011:15 python scripts/munge_fah_data_parallel.py
server   11175 98.4  0.0 3044124 211484 pts/2  Sl+  01:38 1006:39 python scripts/munge_fah_data_parallel.py
server   11176 99.4  0.1 3104544 271956 pts/2  Rl+  01:38 1016:58 python scripts/munge_fah_data_parallel.py
server   11177 99.2  0.1 3103288 270700 pts/2  Sl+  01:38 1014:41 python scripts/munge_fah_data_parallel.py
server   11178 99.0  0.1 3108832 276248 pts/2  Sl+  01:38 1012:20 python scripts/munge_fah_data_parallel.py
server   11179 98.8  0.0 3088436 255620 pts/2  Sl+  01:38 1010:42 python scripts/munge_fah_data_parallel.py
server   11180 98.5  0.0 3042332 209728 pts/2  Sl+  01:38 1006:58 python scripts/munge_fah_data_parallel.py
server   11181 99.5  0.1 3105416 272756 pts/2  Rl+  01:38 1017:05 python scripts/munge_fah_data_parallel.py
server   11182 99.4  0.1 3102796 270180 pts/2  Rl+  01:38 1016:57 python scripts/munge_fah_data_parallel.py
server   11183 98.7  0.1 3104076 271556 pts/2  Sl+  01:38 1009:04 python scripts/munge_fah_data_parallel.py
server   11184 98.8  0.0 3071572 239132 pts/2  Sl+  01:38 1010:49 python scripts/munge_fah_data_parallel.py
server   11185 98.4  0.0 3044320 211712 pts/2  Sl+  01:38 1006:07 python scripts/munge_fah_data_parallel.py
server   11186 99.0  0.1 3108496 275904 pts/2  Sl+  01:38 1012:14 python scripts/munge_fah_data_parallel.py
server   11187 99.5  0.0 3091052 258472 pts/2  Rl+  01:38 1017:01 python scripts/munge_fah_data_parallel.py
server   17278  1.8  0.0 1477932 219468 pts/2  Sl+  Mar05  26:02 python scripts/munge_fah_data_parallel.py
server   17279  1.9  0.0 1478372 219748 pts/2  Sl+  Mar05  27:02 python scripts/munge_fah_data_parallel.py
server   17280  1.7  0.0 1480480 221924 pts/2  Sl+  Mar05  24:49 python scripts/munge_fah_data_parallel.py
server   17281  1.8  0.1 1537432 278576 pts/2  Sl+  Mar05  25:09 python scripts/munge_fah_data_parallel.py
server   17282  1.7  0.1 1536932 278296 pts/2  Sl+  Mar05  25:02 python scripts/munge_fah_data_parallel.py
server   17283  1.8  0.0 1478136 219684 pts/2  Sl+  Mar05  25:58 python scripts/munge_fah_data_parallel.py
server   17284  1.8  0.1 1557156 298544 pts/2  Sl+  Mar05  25:59 python scripts/munge_fah_data_parallel.py
server   17285  1.8  0.0 1477584 219112 pts/2  Sl+  Mar05  25:17 python scripts/munge_fah_data_parallel.py
server   17286  1.7  0.0 1478208 219372 pts/2  Sl+  Mar05  24:56 python scripts/munge_fah_data_parallel.py
server   17287  2.1  0.1 1539100 280744 pts/2  Sl+  Mar05  29:26 python scripts/munge_fah_data_parallel.py
server   17288  1.7  0.0 1478864 220416 pts/2  Sl+  Mar05  24:23 python scripts/munge_fah_data_parallel.py
server   17289  1.7  0.0 1479892 221048 pts/2  Sl+  Mar05  24:31 python scripts/munge_fah_data_parallel.py
server   17290  1.9  0.0 1469212 210876 pts/2  Sl+  Mar05  27:43 python scripts/munge_fah_data_parallel.py
server   17291  1.7  0.0 1479160 220288 pts/2  Sl+  Mar05  25:02 python scripts/munge_fah_data_parallel.py
server   17292  1.9  0.0 1480544 221676 pts/2  Sl+  Mar05  27:21 python scripts/munge_fah_data_parallel.py
server   17293  1.7  0.0 1477272 218384 pts/2  Sl+  Mar05  24:59 python scripts/munge_fah_data_parallel.py
server   20326  1.4  0.1 1721736 264584 pts/2  Sl+  Mar05  19:51 python scripts/munge_fah_data_parallel.py
server   20327  1.3  0.0 1701368 244204 pts/2  Sl+  Mar05  18:34 python scripts/munge_fah_data_parallel.py
server   20328  1.3  0.0 1693204 236080 pts/2  Sl+  Mar05  18:26 python scripts/munge_fah_data_parallel.py
server   20329  1.3  0.0 1685800 228680 pts/2  Sl+  Mar05  18:53 python scripts/munge_fah_data_parallel.py
server   20330  1.4  0.0 1700328 243268 pts/2  Sl+  Mar05  20:16 python scripts/munge_fah_data_parallel.py
server   20331  1.3  0.0 1709288 252184 pts/2  Sl+  Mar05  18:33 python scripts/munge_fah_data_parallel.py
server   20332  1.3  0.0 1684236 227120 pts/2  Sl+  Mar05  19:08 python scripts/munge_fah_data_parallel.py
server   20333  1.3  0.0 1703248 246132 pts/2  Sl+  Mar05  18:55 python scripts/munge_fah_data_parallel.py
server   20334  1.4  0.0 1690012 232904 pts/2  Sl+  Mar05  19:08 python scripts/munge_fah_data_parallel.py
server   20335  1.4  0.0 1690408 233252 pts/2  Sl+  Mar05  19:09 python scripts/munge_fah_data_parallel.py
server   20336  1.3  0.0 1709476 252388 pts/2  Sl+  Mar05  18:30 python scripts/munge_fah_data_parallel.py
server   20337  1.3  0.0 1712448 255320 pts/2  Sl+  Mar05  18:18 python scripts/munge_fah_data_parallel.py
server   20338  1.3  0.0 1704372 247252 pts/2  Sl+  Mar05  18:24 python scripts/munge_fah_data_parallel.py
server   20339  1.3  0.0 1699160 242000 pts/2  Sl+  Mar05  18:40 python scripts/munge_fah_data_parallel.py
server   20340  1.4  0.0 1702460 245336 pts/2  Sl+  Mar05  19:11 python scripts/munge_fah_data_parallel.py
server   20341  1.4  0.0 1701524 244396 pts/2  Sl+  Mar05  20:22 python scripts/munge_fah_data_parallel.py
server   26193  1.7  0.0 1883380 229704 pts/2  Sl+  Mar05  23:33 python scripts/munge_fah_data_parallel.py
server   26194  1.4  0.0 1812140 158436 pts/2  Sl+  Mar05  19:36 python scripts/munge_fah_data_parallel.py
server   26195  1.8  0.0 1869884 216188 pts/2  Sl+  Mar05  25:22 python scripts/munge_fah_data_parallel.py
server   26196  1.2  0.0 1810116 156572 pts/2  Sl+  Mar05  16:21 python scripts/munge_fah_data_parallel.py
server   26197  1.5  0.0 1877644 223968 pts/2  Sl+  Mar05  20:27 python scripts/munge_fah_data_parallel.py
server   26198  1.3  0.0 1890612 236716 pts/2  Sl+  Mar05  18:41 python scripts/munge_fah_data_parallel.py
server   26199  1.2  0.0 1832132 178428 pts/2  Sl+  Mar05  16:36 python scripts/munge_fah_data_parallel.py
server   26200  1.2  0.0 1834072 180376 pts/2  Sl+  Mar05  16:31 python scripts/munge_fah_data_parallel.py
server   26201  1.2  0.0 1836188 182488 pts/2  Sl+  Mar05  16:36 python scripts/munge_fah_data_parallel.py
server   26202  1.3  0.0 1837408 183704 pts/2  Sl+  Mar05  18:19 python scripts/munge_fah_data_parallel.py
server   26203  1.4  0.0 1837384 183688 pts/2  Sl+  Mar05  18:49 python scripts/munge_fah_data_parallel.py
server   26204  1.4  0.0 1885584 231908 pts/2  Sl+  Mar05  19:50 python scripts/munge_fah_data_parallel.py
server   26205  1.5  0.0 1834780 181088 pts/2  Sl+  Mar05  21:21 python scripts/munge_fah_data_parallel.py
server   26206  1.2  0.0 1837000 183404 pts/2  Sl+  Mar05  16:07 python scripts/munge_fah_data_parallel.py
server   26207  1.7  0.0 1870032 216340 pts/2  Sl+  Mar05  23:02 python scripts/munge_fah_data_parallel.py
server   26208  1.5  0.0 1840040 186352 pts/2  Sl+  Mar05  21:09 python scripts/munge_fah_data_parallel.py
server   27094  1.2  0.7 3846420 1994892 pts/2 Sl+  Mar05  16:33 python scripts/munge_fah_data_parallel.py
server   27095  1.2  0.5 3231804 1381336 pts/2 Sl+  Mar05  16:05 python scripts/munge_fah_data_parallel.py
server   27096  1.2  0.0 2063120 212792 pts/2  Sl+  Mar05  16:39 python scripts/munge_fah_data_parallel.py
server   27097  1.2  0.5 3289964 1439472 pts/2 Sl+  Mar05  16:04 python scripts/munge_fah_data_parallel.py
server   27098  1.4  0.1 3845104 301288 pts/2  Sl+  Mar05  18:49 python scripts/munge_fah_data_parallel.py
server   27099  1.1  0.0 2073684 223376 pts/2  Sl+  Mar05  15:29 python scripts/munge_fah_data_parallel.py
server   27100  1.2  0.0 2064280 213964 pts/2  Sl+  Mar05  16:54 python scripts/munge_fah_data_parallel.py
server   27101  1.3  0.0 2102388 252064 pts/2  Sl+  Mar05  18:40 python scripts/munge_fah_data_parallel.py
server   27102  1.2  0.0 2089488 238960 pts/2  Sl+  Mar05  16:45 python scripts/munge_fah_data_parallel.py
server   27103  1.2  0.0 2031412 181088 pts/2  Sl+  Mar05  16:24 python scripts/munge_fah_data_parallel.py
server   27104  1.2  0.3 3624708 1049548 pts/2 Sl+  Mar05  16:26 python scripts/munge_fah_data_parallel.py
server   27105  1.2  0.0 2053292 202972 pts/2  Sl+  Mar05  16:55 python scripts/munge_fah_data_parallel.py
server   27106  1.0  0.9 4370772 2519744 pts/2 Sl+  Mar05  14:32 python scripts/munge_fah_data_parallel.py
server   27107  1.2  0.0 2043076 192916 pts/2  Sl+  Mar05  17:17 python scripts/munge_fah_data_parallel.py
server   27108  1.2  0.0 2028504 178220 pts/2  Sl+  Mar05  17:06 python scripts/munge_fah_data_parallel.py
server   27109  1.1  0.1 2224816 374408 pts/2  Sl+  Mar05  15:21 python scripts/munge_fah_data_parallel.py
server   27223  0.4  0.0 2270008 223160 pts/2  Sl+  Mar05   6:13 python scripts/munge_fah_data_parallel.py
server   27224  0.4  0.0 2273880 227032 pts/2  Sl+  Mar05   6:10 python scripts/munge_fah_data_parallel.py
server   27225  0.4  0.0 2270412 223280 pts/2  Sl+  Mar05   5:54 python scripts/munge_fah_data_parallel.py
server   27226  0.5  0.0 2278600 231748 pts/2  Sl+  Mar05   7:22 python scripts/munge_fah_data_parallel.py
server   27227  0.5  0.1 2344860 297824 pts/2  Sl+  Mar05   6:55 python scripts/munge_fah_data_parallel.py
server   27228  0.6  0.0 2270788 223940 pts/2  Sl+  Mar05   8:42 python scripts/munge_fah_data_parallel.py
server   27229  0.4  0.0 2279592 231936 pts/2  Sl+  Mar05   6:05 python scripts/munge_fah_data_parallel.py
server   27230  0.5  0.0 2275256 227600 pts/2  Sl+  Mar05   7:25 python scripts/munge_fah_data_parallel.py
server   27231  0.5  0.0 2278296 231196 pts/2  Sl+  Mar05   7:16 python scripts/munge_fah_data_parallel.py
server   27232  0.8  0.0 2271308 224460 pts/2  Sl+  Mar05  11:47 python scripts/munge_fah_data_parallel.py
server   27233  0.6  0.0 2272084 225232 pts/2  Sl+  Mar05   8:17 python scripts/munge_fah_data_parallel.py
server   27234  0.5  0.0 2272752 225648 pts/2  Sl+  Mar05   7:58 python scripts/munge_fah_data_parallel.py
server   27235  0.7  0.1 2354816 308028 pts/2  Sl+  Mar05   9:56 python scripts/munge_fah_data_parallel.py
server   27236  0.5  0.0 2278892 232044 pts/2  Sl+  Mar05   7:18 python scripts/munge_fah_data_parallel.py
server   27237  0.6  0.0 2269560 222712 pts/2  Sl+  Mar05   9:14 python scripts/munge_fah_data_parallel.py
server   27238  0.5  0.1 2379544 331956 pts/2  Sl+  Mar05   7:16 python scripts/munge_fah_data_parallel.py
server   29398  1.1  0.0 2479236 235556 pts/2  Sl+  Mar05  15:30 python scripts/munge_fah_data_parallel.py
server   29399  1.1  0.0 2462796 218812 pts/2  Sl+  Mar05  15:26 python scripts/munge_fah_data_parallel.py
server   29400  1.1  0.0 2462768 218996 pts/2  Sl+  Mar05  15:42 python scripts/munge_fah_data_parallel.py
server   29401  1.1  0.0 2462936 219124 pts/2  Sl+  Mar05  15:25 python scripts/munge_fah_data_parallel.py
server   29402  1.1  0.0 2471908 228204 pts/2  Sl+  Mar05  14:33 python scripts/munge_fah_data_parallel.py
server   29403  1.1  0.0 2462984 219132 pts/2  Sl+  Mar05  15:40 python scripts/munge_fah_data_parallel.py
server   29404  1.1  0.0 2462424 218460 pts/2  Sl+  Mar05  15:28 python scripts/munge_fah_data_parallel.py
server   29405  1.1  0.0 2461688 218128 pts/2  Sl+  Mar05  15:27 python scripts/munge_fah_data_parallel.py
server   29406  1.1  0.0 2470880 227200 pts/2  Sl+  Mar05  15:28 python scripts/munge_fah_data_parallel.py
server   29407  1.1  0.0 2463736 220040 pts/2  Sl+  Mar05  15:14 python scripts/munge_fah_data_parallel.py
server   29408  1.1  0.0 2479196 235512 pts/2  Sl+  Mar05  15:36 python scripts/munge_fah_data_parallel.py
server   29409  1.1  0.0 2462608 218700 pts/2  Sl+  Mar05  15:42 python scripts/munge_fah_data_parallel.py
server   29410  1.1  0.0 2461208 217660 pts/2  Sl+  Mar05  15:40 python scripts/munge_fah_data_parallel.py
server   29411  1.1  0.0 2462808 219008 pts/2  Sl+  Mar05  15:06 python scripts/munge_fah_data_parallel.py
server   29412  0.9  0.0 2455472 211964 pts/2  Sl+  Mar05  13:07 python scripts/munge_fah_data_parallel.py
server   29413  1.1  0.0 2461120 217576 pts/2  Sl+  Mar05  15:29 python scripts/munge_fah_data_parallel.py
server   35491 18.1  0.0 2689844 250080 pts/2  Sl+  Mar05 234:10 python scripts/munge_fah_data_parallel.py
server   35492 18.5  0.0 2691048 251248 pts/2  Sl+  Mar05 238:49 python scripts/munge_fah_data_parallel.py
server   35493 15.5  0.0 2692812 252816 pts/2  Sl+  Mar05 200:27 python scripts/munge_fah_data_parallel.py
server   35494 17.3  0.0 2694780 254796 pts/2  Sl+  Mar05 224:02 python scripts/munge_fah_data_parallel.py
server   35495 18.8  0.0 2689688 249920 pts/2  Sl+  Mar05 243:18 python scripts/munge_fah_data_parallel.py
server   35496 17.3  0.0 2689296 249544 pts/2  Sl+  Mar05 224:02 python scripts/munge_fah_data_parallel.py
server   35497 18.7  0.0 2688980 249332 pts/2  Sl+  Mar05 242:10 python scripts/munge_fah_data_parallel.py
server   35498 18.6  0.0 2682528 242760 pts/2  Sl+  Mar05 240:14 python scripts/munge_fah_data_parallel.py
server   35499 17.5  0.0 2689508 249748 pts/2  Sl+  Mar05 226:12 python scripts/munge_fah_data_parallel.py
server   35500 18.9  0.0 2683200 243436 pts/2  Sl+  Mar05 244:10 python scripts/munge_fah_data_parallel.py
server   35501 18.7  0.0 2688964 249208 pts/2  Sl+  Mar05 241:25 python scripts/munge_fah_data_parallel.py
server   35502 18.6  0.0 2688764 248996 pts/2  Sl+  Mar05 239:59 python scripts/munge_fah_data_parallel.py
server   35503 18.6  0.0 2681480 241820 pts/2  Sl+  Mar05 240:32 python scripts/munge_fah_data_parallel.py
server   35504 17.6  0.0 2692360 252596 pts/2  Sl+  Mar05 227:26 python scripts/munge_fah_data_parallel.py
server   35505 18.3  0.0 2692180 252412 pts/2  Sl+  Mar05 236:40 python scripts/munge_fah_data_parallel.py
server   35506 12.3  0.0 2681512 241744 pts/2  Sl+  Mar05 159:19 python scripts/munge_fah_data_parallel.py
server   41492  5.9  0.0 3522420 34276 pts/2   Sl+  Mar05  90:59 python scripts/munge_fah_data_parallel.py
server   41506  1.0  0.0 655416 55796 pts/2    Sl+  Mar05  15:50 python scripts/munge_fah_data_parallel.py
server   41507  1.2  0.0 658100 111628 pts/2   Sl+  Mar05  18:27 python scripts/munge_fah_data_parallel.py
server   41508  1.3  0.0 654448 82388 pts/2    Sl+  Mar05  19:49 python scripts/munge_fah_data_parallel.py
server   41509  1.0  0.0 700160 78912 pts/2    Sl+  Mar05  16:04 python scripts/munge_fah_data_parallel.py
server   41510  1.0  0.0 654476 100404 pts/2   Sl+  Mar05  16:27 python scripts/munge_fah_data_parallel.py
server   41511  1.2  0.0 693644 153608 pts/2   Sl+  Mar05  19:13 python scripts/munge_fah_data_parallel.py
server   41512  1.0  0.0 653876 131936 pts/2   Sl+  Mar05  15:36 python scripts/munge_fah_data_parallel.py
server   41513  1.2  0.0 690576 149856 pts/2   Sl+  Mar05  18:19 python scripts/munge_fah_data_parallel.py
server   41514  1.1  0.0 630932 113380 pts/2   Sl+  Mar05  17:42 python scripts/munge_fah_data_parallel.py
server   41515  0.9  0.0 654028 128292 pts/2   Sl+  Mar05  15:00 python scripts/munge_fah_data_parallel.py
server   41516  1.1  0.0 653276 124996 pts/2   Sl+  Mar05  17:49 python scripts/munge_fah_data_parallel.py
server   41517  1.1  0.0 666628 136676 pts/2   Sl+  Mar05  17:03 python scripts/munge_fah_data_parallel.py
server   41518  1.0  0.0 658844 137908 pts/2   Sl+  Mar05  15:59 python scripts/munge_fah_data_parallel.py
server   41519  1.1  0.0 632264 113816 pts/2   Sl+  Mar05  17:09 python scripts/munge_fah_data_parallel.py
server   41520  1.0  0.0 629700 97100 pts/2    Sl+  Mar05  15:56 python scripts/munge_fah_data_parallel.py
server   41521  1.1  0.0 689932 114904 pts/2   Sl+  Mar05  17:55 python scripts/munge_fah_data_parallel.py
server   44621  1.2  0.0 869436 127608 pts/2   Sl+  Mar05  18:19 python scripts/munge_fah_data_parallel.py
server   44622  1.2  0.0 918464 171636 pts/2   Sl+  Mar05  19:10 python scripts/munge_fah_data_parallel.py
server   44623  1.2  0.0 963380 246468 pts/2   Sl+  Mar05  18:26 python scripts/munge_fah_data_parallel.py
server   44624  1.2  0.0 861556 135816 pts/2   Sl+  Mar05  18:46 python scripts/munge_fah_data_parallel.py
server   44625  1.3  0.0 917684 173152 pts/2   Sl+  Mar05  19:51 python scripts/munge_fah_data_parallel.py
server   44626  1.2  0.0 941892 223248 pts/2   Sl+  Mar05  18:19 python scripts/munge_fah_data_parallel.py
server   44627  1.2  0.0 910728 208584 pts/2   Sl+  Mar05  18:52 python scripts/munge_fah_data_parallel.py
server   44628  1.3  0.0 937112 229232 pts/2   Sl+  Mar05  20:55 python scripts/munge_fah_data_parallel.py
server   44629  1.3  0.0 960372 249244 pts/2   Sl+  Mar05  19:53 python scripts/munge_fah_data_parallel.py
server   44630  1.2  0.0 861560 161080 pts/2   Sl+  Mar05  18:28 python scripts/munge_fah_data_parallel.py
server   44631  1.2  0.0 889292 182700 pts/2   Sl+  Mar05  18:57 python scripts/munge_fah_data_parallel.py
server   44632  1.2  0.0 931912 189788 pts/2   Sl+  Mar05  19:16 python scripts/munge_fah_data_parallel.py
server   44633  1.2  0.0 911896 171404 pts/2   Sl+  Mar05  18:55 python scripts/munge_fah_data_parallel.py
server   44634  1.2  0.0 868900 141580 pts/2   Sl+  Mar05  18:04 python scripts/munge_fah_data_parallel.py
server   44635  1.2  0.0 917272 168080 pts/2   Sl+  Mar05  18:50 python scripts/munge_fah_data_parallel.py
server   44636  1.2  0.0 925240 221672 pts/2   Sl+  Mar05  19:05 python scripts/munge_fah_data_parallel.py
-bash-4.1$ ps xauwww | grep python | wc -l
211
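The 211 reported by wc -l counts every line that matches "python", which includes the grep command itself and the unrelated osad daemon. A minimal sketch of a stricter count is below. The helper name count_script_processes and the embedded sample text are illustrative assumptions, not part of fahmunge; the function only parses text in the ps aux column layout shown above.

```python
def count_script_processes(ps_output, script_name):
    """Count `ps aux`-style lines whose command is python running script_name."""
    count = 0
    for line in ps_output.splitlines():
        # ps aux prints 11 columns; the last one is the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10].split()
        # Require the executable itself to be python, which skips "grep python",
        # and require the script path among its arguments, which skips osad.
        if command and command[0].endswith("python") and script_name in command[1:]:
            count += 1
    return count


# Sample lines copied from the listing above (hypothetical test input).
SAMPLE = """\
server    3259  0.0  0.0 103308   876 pts/0    S+   18:40   0:00 grep python
server    4000  1.6  0.0 1071708 203424 pts/2  Sl+  Mar05  24:25 python scripts/munge_fah_data_parallel.py
root      4077  0.0  0.0 205956  9496 ?        S    Mar01   0:23 /usr/bin/python -s /usr/sbin/osad --pid-file /var/run/osad.pid
server    4001  2.2  0.0 1094348 227188 pts/2  Sl+  Mar05  32:36 python scripts/munge_fah_data_parallel.py
"""

print(count_script_processes(SAMPLE, "scripts/munge_fah_data_parallel.py"))
```

Applied to the full listing above, a filter like this would report 209 rather than 211.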

After reattaching with screen -r and sending a keyboard interrupt, all of the munge processes are gone:

[server@plfah2 FAHMunge]$ ps xauwww | grep python
server    3312  0.0  0.0 103308   876 pts/2    S+   18:43   0:00 grep python
root      4077  0.0  0.0 205956  9496 ?        S    Mar01   0:23 /usr/bin/python -s /usr/sbin/osad --pid-file /var/run/osad.pid

Will investigate. Restarting for now.

jchodera commented 8 years ago

Seems relevant: http://stackoverflow.com/questions/30506489/python-multiprocessing-leading-to-many-zombie-processes
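In the listing above, the worker PIDs fall into groups of 16 with different start times, which would be consistent with a new pool of workers being created on each pass while the earlier pools are never shut down. The actual fix is not shown in this thread, so the following is only a sketch of one common remedy for that pattern: call close() and join() on each multiprocessing.Pool once its work is done. The names munge_project and run_one_pass are hypothetical stand-ins, not functions from fahmunge.

```python
import multiprocessing


def munge_project(project_id):
    # Hypothetical stand-in for the real per-project munging work.
    return project_id * 2


def run_one_pass(project_ids, nprocesses=4):
    """Process all projects once, then shut the worker pool down cleanly."""
    pool = multiprocessing.Pool(processes=nprocesses)
    try:
        results = pool.map(munge_project, project_ids)
    finally:
        pool.close()  # no further tasks will be submitted to this pool
        pool.join()   # wait for the workers to exit so none are left behind
    return results


if __name__ == "__main__":
    # Repeated passes, as in a long-running munging loop; each pass leaves
    # no worker processes behind once run_one_pass returns.
    for _ in range(3):
        print(run_one_pass([1, 2, 3]))
```

An alternative design would be to create a single pool outside the loop and reuse it for every pass, which avoids repeated worker start-up entirely.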

jchodera commented 8 years ago

Testing a fix.

jchodera commented 8 years ago

Seems to be fixed now. I'll have to copy the changes (which I made in situ on plfah2) back to merge them into the repo.

jchodera commented 8 years ago

Fixed (hopefully!) by #22