hoffmangroup / segway

Application for semi-automated genomic annotation.
http://segway.hoffmanlab.org/
GNU General Public License v2.0

segway on SGE does not submit job arguments longer than 1024 chars #72

Closed EricR86 closed 8 years ago

EricR86 commented 8 years ago

Original report (BitBucket issue) by Rachel Chan (Bitbucket: rcwchan).


This is a DRMAA issue that segway is impacted by. Upon calling

#!python

> /mnt/work1/users/home2/rachelc/segway/segway/run.py(1874)queue_gmtk()
(Pdb) l
1869             job_tmpl.jobName = job_name
1870             job_tmpl.remoteCommand = ENV_CMD
1871 ->         job_tmpl.args = map(str, args)

__set__ from drmaa/helpers.py is called, and does the following:

#!python

181          def __set__(self, instance, value):
182              c(drmaa_set_vector_attribute, instance,
183  ->              self.name, string_vector(value))

string_vector(value) has no truncating effect on the passed arguments. It is drmaa_set_vector_attribute that appears to have a buffer overflow of some sort and caps all arguments at 1024 chars.

This seems to be a pre-existing issue that minibatch only made easier to isolate, presumably because the list of windows was random in length. Without minibatch, a truncated window number is likely to still be a valid window. For example, cutting the window number '1000' in half leaves '10', which would likely be a valid window, so ending the argument there would throw no error. With minibatch, '10' is unlikely to be a valid window because the windows are chosen at random, so the truncation surfaces as an error.
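To make the reasoning above concrete, here is a minimal sketch that only simulates the reported symptom with plain string slicing. It does not call DRMAA. The 1024-character cap is taken from this report, and the window list (every window from 0 to 499) and the helper names are hypothetical.

```python
# Assumed cap, taken from the report above; the exact boundary is not verified here.
ARG_LIMIT = 1024


def truncate_arg(arg, limit=ARG_LIMIT):
    """Mimic the reported symptom: keep only the first `limit` characters."""
    return arg[:limit]


def last_window(arg):
    """Return the last comma-separated window index in an argument string."""
    return arg.rsplit(",", 1)[-1]


# Hypothetical window list without minibatch: every window from 0 to 499.
windows = ",".join(str(i) for i in range(500))
truncated = truncate_arg(windows)

# The cut lands in the middle of a window number.
cut_window = last_window(truncated)

# Without minibatch the partial number is still a known window, so nothing fails.
still_valid = cut_window in set(windows.split(","))

print(len(windows), len(truncated))
print(last_window(windows), cut_window, still_valid)
```

With a random subset of windows instead of the full range, the partial number at the cut would usually not be in the set, which matches why the problem became visible with minibatch.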

The bug can be easily recreated and shown with pdb with the following simple test:

#!python

import drmaa

def main():
    session = drmaa.Session()
    session.initialize()
    job_template = session.createJobTemplate()
    args = ["", '5,7,9,14,17,22,31,32,36,37,38,48,50,55,60,63,65,68,70,72,78,79,82,85,86,88,89,90,91,92,95,96,99,100,105,106,111,114,118,119,120,123,126,127,128,131,132,135,136,138,145,148,150,151,154,155,156,164,168,172,174,182,183,184,187,192,193,195,197,198,199,202,203,207,208,212,216,218,220,222,225,226,230,233,239,242,245,251,257,258,259,260,261,265,266,268,279,280,283,284,286,287,288,289,294,298,301,303,304,306,314,318,325,329,330,332,334,337,344,346,352,364,366,369,370,373,379,381,382,384,390,391,392,394,402,403,407,408,414,419,425,431,433,434,439,440,444,445,446,459,460,461,466,468,469,470,473,476,480,486,487,494,495,497,501,515,516,517,522,528,529,533,535,536,538,539,543,546,547,550,551,552,553,562,565,568,569,570,572,575,578,580,582,583,587,588,590,592,598,601,603,606,607,608,613,620,623,626,627,628,629,630,631,634,639,647,650,654,655,658,669,672,673,675,676,681,683,694,697,698,703,706,709,710,712,716,719,725,733,736,741,743,745,748,752,754,756,757,761,765,766,770,771,772,773,774,776,778,779,780,786,788,792,793,796,803,805,808,809,811,812,813,814,817,820,823,829,831,834,840,842,845,850,851,854,856,857,858,864,867,870,871,875,876,878,879,881,885,890,894,895,897,901,902,903,904,905,912,915,921,924,925,927,932,934,936,937,939,941,942,948,951,952,954,955,960,961,965,971,973,977,978,979,980,984,985,995,996,998,1000,1001,1004,1007,1009,1011,1012,1015,1016,1017,1018,1021,1025,1026,1027,1031,1035,1040,1046,1047,1048,1051,1052,1053,1056,1057,1058,1061,1065,1066,1071,1072,1085,1087,1088,1089,1092,1093,1094,1095,1098,1102,1103,1106,1108,1111,1112,1117,1118,1123,1127,1133,1135,1136,1143,1146,1147,1151,1153,1154,1156,1160,1161,1163,1164,1165,1172,1176,1179,1180,1181,1183,1184,1185,1188,1189,1190,1191,1192,1193,1194,1200,1201,1204,1206,1207,1209,1210,1211,1212,1216,1219,1222,1226,1237,1240,1243,1244,1245,1248,1250,1253,1255,1256,1257,1261,1268,1269,1276,1281,1283,1286,1290,1292,1295,1296,1299,1301,1302,1303,1307,1310,1321,1322,1329,1332,1335,1337,1342,1343,1344,1348,1350,1353,1354,1357,1358,1359,1360,1372,1373,1376,1377,1384,1385,1390,1391,1393,1394,1398,1399,1401,1408,1409,1413,1414,1417,1419,1425,1431,1435,1436,1439,1440,1443,1444,1446,1447,1451,1452,1459,1460,1461,1465,1466,1469,1470,1471,1475,1476,1477,1479,1481,1485,1486,1487,1489,1491,1492,1495,1499,1500,1503,1505,1508,1513,1516,1528,1531,1532,1538,1547,1549,1550,1552,1554,1556,1561,1564,1566,1569,1573,1584,1585,1586,1590,1591,1593,1595,1596,1597,1599,1601,1606,1610,1611,1613,1614,1617,1621,1623,1626,1628,1630,1639,1640,1644,1647,1648,1651,1652,1655,1659,1660,1664,1668,1674,1679,1681,1684,1691']
    job_template.args = [str(arg) for arg in args]  # a list, so this also works on Python 3
    print(job_template.args)
    session.exit()

if __name__=='__main__':
    main()

This shows clearly that job_template.args is truncated by drmaa_set_vector_attribute. It is also possible that the unicode memory issue in issue 60 (#60) is caused by this buffer overflow.
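The comparison that the test above leaves to the eye can also be automated. The helper below is a hypothetical sketch, not part of segway or drmaa-python: it compares the arguments that were assigned with the arguments read back and reports the ones that came back as a shorter prefix. The stand-in data simulates a 1024-character cap (the figure from this report) so the snippet runs without a DRMAA installation.

```python
def find_truncated(sent, stored):
    """Return (index, sent_length, stored_length) for each argument that came back shorter.

    An argument counts as truncated when the stored value is a strict
    prefix of the value that was sent.
    """
    report = []
    for index, (sent_arg, stored_arg) in enumerate(zip(sent, stored)):
        sent_arg, stored_arg = str(sent_arg), str(stored_arg)
        if sent_arg != stored_arg and sent_arg.startswith(stored_arg):
            report.append((index, len(sent_arg), len(stored_arg)))
    return report


# Stand-in data: simulate the reported cap instead of calling DRMAA.
sent = ["", "x" * 3000]
stored = [arg[:1024] for arg in sent]

print(find_truncated(sent, stored))  # -> [(1, 3000, 1024)]
```

In the test script above, the same check would be `find_truncated(args, job_template.args)` right after the assignment.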

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


It is worth noting that c(drmaa_set_vector_attribute ... seems to set the Python attribute to unicode from the C code itself, and not in the drmaa Python code.

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Arguments in bash also do not seem to have this limit. You can, for example, write a script that echoes its first argument and pass in an argument longer than 1024 characters.
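The same check can be sketched without a shell script. This is an assumption-laden stand-in for the bash test described above: a child Python process plays the role of the echo script, and the argument is a hypothetical comma-separated list well over 1024 characters.

```python
import subprocess
import sys


def echo_first_arg(arg):
    """Run a child Python process that writes its first argument back to stdout."""
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    output = subprocess.check_output([sys.executable, "-c", child_code, arg])
    return output.decode()


# Hypothetical argument: a comma-separated list far longer than 1024 characters.
long_arg = ",".join(str(i) for i in range(1000))
echoed = echo_first_arg(long_arg)

# If the operating system passed the argument through intact, the lengths match.
print(len(long_arg), len(echoed), echoed == long_arg)
```

If the argument comes back whole here, that supports the point that the cap is not imposed by ordinary process argument passing.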

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


This also does not seem to be a limit of qsub itself in SGE. I can submit jobs with arguments longer than 1024 characters and get the expected output (e.g. by echoing the first argument).

EricR86 commented 8 years ago

Original comment by Rachel Chan (Bitbucket: rcwchan).


DRMAA_python issue submitted here.

EricR86 commented 8 years ago

Original comment by Rachel Chan (Bitbucket: rcwchan).


This bug also occurs on h4h (PBS Torque system). I've emailed the issue to the support email for pbs-drmaa.

EricR86 commented 8 years ago

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Resolved in Pull Request #55

The DRMAA issues themselves are still outstanding, but this is a suitable workaround that should avoid this corner case entirely.