arq5x / bedtools2

bedtools - the swiss army knife for genome arithmetic
MIT License

Pybedtools------IndexError: list index out of range #515

Closed ShixiangWang closed 7 years ago

ShixiangWang commented 7 years ago

I encountered an IndexError when running a program's source code (gwava_annotate.py).

 Traceback (most recent call last):
  File "gwava_annotate.py", line 297, in <module>
    df = annotate(vf)
  File "gwava_annotate.py", line 255, in annotate
    encode_feats(vf, ENCODE_FEATS),
  File "gwava_annotate.py", line 41, in encode_feats
    for entry in annots:
  File "pybedtools/cbedtools.pyx", line 787, in pybedtools.cbedtools.IntervalIterator.__next__ (pybedtools/cbedtools.cxx:11123)
  File "pybedtools/cbedtools.pyx", line 652, in pybedtools.cbedtools.create_interval_from_list (pybedtools/cbedtools.cxx:9208)
IndexError: list index out of range

I checked line 41 and found that the content of annots is:

DNase:26039,H3K4me3:13995,POLR2A:9215,H3K4me2:8024,H3K9ac:6869,H3K27ac:5904,H3K27me3:4816,H3K4me1:4622,FAIRE:4462,H2AFZ:3052,CTCF:2356,H3K36me3:1766,H3K79me2:1469,TAF1:1264,TBP:798,YY1:689,NFKB1:664,MYC:578,MAX:565,USF1:496,SP1:415,HEY1:409,SIN3A:405,ELF1:396,EP300:383,E2F1:377,H4K20me1:369,JUND:330,E2F6:315,EGR1:299,CEBPB:279,PAX5:251,CHD2:227,FOXA1:227,RAD21:218,TCF12:206,E2F4:205,STAT3:202,REST:200,GABPA:197,USF2:197,POU2F2:194,SLC22A2:194,TCF7L2:193,MXI1:183,NR3C1:180,HNF4A:177,GTF2F1:174,GATA1:169,EBF1:167,SPI1:167,HDAC2:163,TRIM28:154,CCNT2:149,HMGN3:141,GATA2:132,FOSL2:130,FOS:129,IRF4:116,ZNF263:116,TFAP2C:115,POLR2A_elongating:114,STAT1:109,RFX5:108,NRF1:107,H3K9me3:99,POLR3A:98,BCLAF1:97,FOXA2:97,IRF1:97,ELK4:87,HNF4G:87,SMC3:87,TFAP2A:87,ZBTB7A:75,TAF7:73,SRF:72,ZEB1:70,ETS1:69,JUN:66,BCL3:64,BRCA1:64,NR2C2:64,SMARCB1:64,BATF:60,GTF2B:58,MEF2A:57,SREBF1:56,TAL1:55,RXRA:54,NFYA:49,HSF1:47,BHLHE40:46,PBX3:44,NFYB:42,RDBP:42,SIX5:42,STAT2:39,ZNF143:38,ATF3:36,MAFK:34,IRF3:31,ZBTB33:29,SETDB1:24,CTCFL:23,SP2:23,SREBF2:21,CTBP2:16,GATA3:16,MEF2_complex:15,NFE2:15,SMARCC1:15,SUZ12:15,JUNB:14,SMARCA4:14,BDP1:13,ERALPHAA:10,FOSL1:10,H3K9me1:10,SIRT6:10,MAFF:9,NANOG:8,SMARCC2:8,BCL11A:4,BRF2:4,PPARGC1A:4,THAP1:4,Eralphaa:2,NR4A1:2,ESRRA:1,FAM48A:1,GTF3C2:1,POU5F1:1,PRDM1:1,ZNF274:1
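For reference, a string in this name:count format can be parsed with plain Python, independently of pybedtools. This is only a minimal sketch; `parse_freqdesc` is a hypothetical helper name and is not part of bedtools or pybedtools.

```python
def parse_freqdesc(field):
    """Parse a 'name:count,name:count' string into a dict of integer counts."""
    counts = {}
    for item in field.strip(',').split(','):
        if not item:
            # Skip empty pieces (e.g. from an empty input string).
            continue
        # Split on the last colon so the count is always the final part.
        name, _, count = item.rpartition(':')
        counts[name] = int(count)
    return counts


example = "DNase:26039,H3K4me3:13995,POLR2A:9215,"
print(parse_freqdesc(example))
```

If this parsing works on the string but iterating over annots still fails, the problem is more likely in how the lines are turned into intervals than in the freqdesc column itself.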

It stops as soon as it reaches the line "for entry in annots:" in this function:

from subprocess import call

from pandas import DataFrame, Series
from pybedtools import BedTool

def encode_feats(vf, af):
    v = BedTool(vf)
    feats = BedTool(af)
    # Column names for the final table come from the first line of <af>.cols
    with open(af + '.cols', 'r') as cols_file:
        cols = cols_file.readline().strip().split(',')
    intersection = feats.intersect(v, wb=True)
    # Sort the intersection file in place by chrom, start, end
    sort_cmd = 'sort -k1,1 -k2,2n -k3,3n %s -o %s' % (intersection.fn, intersection.fn)
    call(sort_cmd, shell=True)
    annots = intersection.groupby(g=[9,10,11,12], c=6, ops='freqdesc')
    results = {}
    for entry in annots:            # IndexError is raised here
        fs = entry[4].strip(',').split(',')
        results[entry.name] = Series({e[0]: int(e[1]) for e in [f.split(':') for f in fs]})
    df = DataFrame(results, index = cols)
    # transpose to turn feature types into columns, and turn all the NAs in to 0s
    return df.T.fillna(0)
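The cause of the IndexError is not established in this thread. One possible way to narrow it down, assuming the error is triggered by a line with fewer fields than expected, is to read the groupby output as plain text and count the fields on each line. The sketch below uses only the standard library; `find_short_lines` is a hypothetical helper, and the default of five fields assumes the output holds the four grouping columns plus the freqdesc column.

```python
import os
import tempfile


def find_short_lines(path, min_fields=5, sep="\t"):
    """Return (line_number, field_count) for lines with fewer than min_fields fields."""
    short = []
    with open(path) as handle:
        for lineno, line in enumerate(handle, start=1):
            fields = line.rstrip("\n").split(sep)
            # An empty line splits into one empty field; report it as zero fields.
            n = 0 if fields == [""] else len(fields)
            if n < min_fields:
                short.append((lineno, n))
    return short


# Small demonstration on a temporary file with made-up rows (not real data).
with tempfile.NamedTemporaryFile("w", suffix=".bed", delete=False) as tmp:
    tmp.write("chr1\t10\t20\trs1\tDNase:2,CTCF:1\n")  # five fields
    tmp.write("\n")                                    # empty line
    tmp.write("chr1\t30\t40\n")                        # three fields
    demo_path = tmp.name

demo_result = find_short_lines(demo_path)
os.remove(demo_path)
print(demo_result)
```

In the function above this could be called as find_short_lines(annots.fn) just before the loop, since the original code already relies on the .fn attribute to get the path of a BedTool's file.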

daler commented 7 years ago

@ShixiangWang pybedtools is actually maintained elsewhere. Can you please:

  1. Open a pybedtools issue on this (https://github.com/daler/pybedtools)
  2. In that new issue, provide example input files (vf, af) that reproduce this behavior.

Thanks!

ShixiangWang commented 7 years ago

I did not know that, thanks for your help.


From: Ryan Dale [notifications@github.com] Sent: April 4, 2017 21:44 To: arq5x/bedtools2 Cc: 王诗翔; Mention Subject: Re: [arq5x/bedtools2] Pybedtools------IndexError: list index out of range (#515)
