to_dataframe() creates 0th row with generic names in nucleotide_content

daler / pybedtools

Python wrapper -- and more -- for BEDTools (bioinformatics tools for "genome arithmetic")

http://daler.github.io/pybedtools

to_dataframe() creates 0th row with generic names in nucleotide_content #385

Closed mheskett closed 1 year ago

mheskett commented 1 year ago

Calling to_dataframe() on the output of nucleotide_content() gives a dataframe whose 0th row holds the nucleotide content column names, while the actual column labels are generic integers. This makes the dataframe unusable as is: you have to delete the 0th row manually.

            0          1          2         3         4        5        6        7        8         9          10          11
0  #1_usercol  2_usercol  3_usercol  4_pct_at  5_pct_gc  6_num_A  7_num_C  8_num_G  9_num_T  10_num_N  11_num_oth  12_seq_len
1           1          0      25000  0.256160  0.343840     3312     4440     4156     3092     10000           0       25000

DevangThakkar commented 1 year ago

I believe this was brought up before in https://github.com/daler/pybedtools/issues/258 and addressed in https://github.com/daler/pybedtools/pull/264. The solution is to set disable_auto_names=True in the to_dataframe() call.
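For anyone landing here, a minimal sketch of what is going on, assuming to_dataframe() forwards its arguments to pandas.read_csv (check the pybedtools docs for your version). The first part reproduces the symptom with pandas alone, using the row quoted in the report. The helper at the end shows where disable_auto_names=True would go; its name and its bed_path / fasta_path parameters are hypothetical, and it needs pybedtools and BEDTools installed to run.

```python
import io

import pandas as pd

# Tab-separated text shaped like the output quoted in the report:
# a '#'-prefixed header line followed by one data row.
NUC_OUTPUT = (
    "#1_usercol\t2_usercol\t3_usercol\t4_pct_at\t5_pct_gc\t6_num_A\t7_num_C\t"
    "8_num_G\t9_num_T\t10_num_N\t11_num_oth\t12_seq_len\n"
    "1\t0\t25000\t0.256160\t0.343840\t3312\t4440\t4156\t3092\t10000\t0\t25000\n"
)

# Symptom: with header=None pandas assigns integer column labels
# and reads the header line as data row 0.
as_data = pd.read_csv(io.StringIO(NUC_OUTPUT), sep="\t", header=None)

# With default header inference the first line becomes the column labels.
with_header = pd.read_csv(io.StringIO(NUC_OUTPUT), sep="\t")


def nucleotide_content_df(bed_path, fasta_path):
    """Hypothetical helper: nucleotide content as a DataFrame with real headers."""
    import pybedtools  # imported lazily; requires pybedtools and BEDTools

    nuc = pybedtools.BedTool(bed_path).nucleotide_content(fi=fasta_path)
    # disable_auto_names=True is the fix suggested in this thread.
    return nuc.to_dataframe(disable_auto_names=True)
```

In the pandas-only part, as_data has the header line as its 0th row, as in the report, while with_header has a single data row under the named columns.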

daler commented 1 year ago

Yes, and thanks @DevangThakkar for pointing that out!