I am trying to clean an SBtab file by converting each sbtable to a dataframe, cleaning the dataframe, and then converting it back to an sbtable. There is an error in the from_data_frame() function. On line 934, the sbtab columns are assigned from the dataframe columns without adding a ! character in front of each one.
    sbtab.columns = df.columns.tolist()
should be
    sbtab.columns = ['!' + x for x in df.columns.tolist()]
Here's an example of some code that could be used to test this:
def df_conversion_check(sbtab):
    """Check conversion between dataframe and sbtab table."""
    df = sbtab.to_data_frame()
    return SBtab.SBtabTable.from_data_frame(df, sbtab.table_id, sbtab.table_type, sbtab.table_name)

sbtable_before = s.sbtabs[0] # get an sbtable object from somewhere
display(sbtable_before.to_data_frame())
sbtable_after = df_conversion_check(sbtable_before)
display(sbtable_after.to_data_frame())
This is indeed a problem with the to_data_frame() function, but the solution is not so simple. The "!" sign should only be added in some cases (when the column header is in a list of pre-defined values).
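For illustration, the conditional prefixing described above could be sketched roughly as follows. This is only a sketch, not the library's actual fix: the function name restore_header_marks and the KNOWN_HEADERS set are hypothetical, and the real list of pre-defined headers would have to come from the SBtab definitions. It also assumes all column names are strings.

```python
# Hypothetical set of pre-defined SBtab column names (placeholder values only;
# the real list would come from the SBtab definitions).
KNOWN_HEADERS = {"ID", "Name", "ReactionFormula"}


def restore_header_marks(columns, known_headers=KNOWN_HEADERS):
    """Prefix '!' only to column names found in known_headers.

    Names that already start with '!' and names not in the list
    are returned unchanged.
    """
    restored = []
    for col in columns:
        if col.startswith("!") or col not in known_headers:
            restored.append(col)
        else:
            restored.append("!" + col)
    return restored


print(restore_header_marks(["ID", "Name", "MyCustomColumn"]))
# → ['!ID', '!Name', 'MyCustomColumn']
```

With a helper like this, the assignment in from_data_frame() could become something like sbtab.columns = restore_header_marks(df.columns.tolist()), so that user-defined columns are not marked as pre-defined ones.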