When the entry_line parameter of the GEOparse.__parse_entry() function contains more than one equal sign, the current function leads to an exception (GEOparse.GEOTypes.DataIncompatibilityException), as follows:
gpl = GEOparse.get_GEO(geo="GPL6101", silent=True, include_data=True, destdir=".")
(<class 'GEOparse.GEOTypes.DataIncompatibilityException'>, DataIncompatibilityException('\nData columns do not match columns description index in GSM665713\nColumns in table are: ID_REF, VALUE, T0-S(0)-2=S(15).Detection Pval\nIndex in columns are: ID_REF, VALUE, T0-S(0)-2\n',), <traceback object at 0x7f57f4dd9688>)
The line causing the above exception is as follows:
#T0-S(0)-2=S(15).Detection Pval =
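The truncated column name in the exception message (T0-S(0)-2) is consistent with the key being cut at the first "=". The snippet below is only a sketch of that effect, under the assumption that the key and value are separated at the first "="; the actual GEOparse code may differ, and the variable names are illustrative.

```python
# The offending column description line from the SOFT file.
line = "#T0-S(0)-2=S(15).Detection Pval = "

# Assumption: key and value are separated at the first "=".
body = line.strip()[1:]  # drop the leading "#"
key, value = [part.strip() for part in body.split("=", 1)]

print(repr(key))    # 'T0-S(0)-2'
print(repr(value))  # 'S(15).Detection Pval ='
```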
Thus, I suggest modifying the GEOparse.__parse_entry() function as follows:
from re import split, sub


def __parse_entry(entry_line):
    """Parse the SOFT file entry name line that starts with '^', '!' or '#'.

    Args:
        entry_line (:obj:`str`): Line from SOFT to be parsed.

    Returns:
        :obj:`2-tuple`: Type of entry, value of entry.

    """
    if entry_line.startswith("!"):
        entry_line = sub(r"!\w*?_", "", entry_line)
    else:
        entry_line = entry_line.strip()[1:]
    n_equal_sign = entry_line.count("=")
    try:
        if 1 == n_equal_sign:
            entry_type, entry_name = [i.strip() for i in entry_line.split("=", maxsplit=1)]
        else:
            # Zero or several "=": split only at the " =" separator, so that
            # an "=" inside the entry type itself is preserved.
            entry_type, entry_name = [i.strip() for i in split(" = ?", entry_line, maxsplit=1)]
    except ValueError:
        # The split produced a single element, so there is no entry name.
        if 1 == n_equal_sign:
            entry_type = [i.strip() for i in entry_line.split("=", maxsplit=1)][0]
        else:
            entry_type = [i.strip() for i in split(" = ?", entry_line, maxsplit=1)][0]
        entry_name = ""
    return entry_type, entry_name
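To check the suggestion outside of GEOparse, here is a minimal standalone sketch of the same logic. The function name parse_entry_sketch is hypothetical, the try/except is replaced by an equivalent length check, and the second and third input lines are illustrative rather than taken from the actual SOFT file.

```python
from re import split, sub


def parse_entry_sketch(entry_line):
    """Standalone copy of the proposed logic (hypothetical helper name)."""
    if entry_line.startswith("!"):
        entry_line = sub(r"!\w*?_", "", entry_line)
    else:
        entry_line = entry_line.strip()[1:]
    if entry_line.count("=") == 1:
        parts = entry_line.split("=", maxsplit=1)
    else:
        # Zero or several "=": split only at the " =" separator.
        parts = split(" = ?", entry_line, maxsplit=1)
    parts = [p.strip() for p in parts]
    entry_type = parts[0]
    # With maxsplit=1 there are at most two parts; a missing value becomes "".
    entry_name = parts[1] if len(parts) > 1 else ""
    return entry_type, entry_name


# The line from the report keeps its full column name.
print(parse_entry_sketch("#T0-S(0)-2=S(15).Detection Pval = "))
# Illustrative lines with a single "=" still parse as before.
print(parse_entry_sketch("#ID_REF = probe identifier"))
print(parse_entry_sketch("^SAMPLE = GSM665713"))
```

With this logic, the column name from the failing line is returned whole, so it should match the name that appears in the table header.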
For reference, the columns variable taken from GEOparse.parse_columns(soft) looks as follows:
Index(['ID_REF', 'VALUE', 'T0-S(0)-2'], dtype='object')
Meanwhile, GEOparse.parse_table_data(soft) correctly parsed the SOFT data as follows:
Index(['ID_REF', 'VALUE', 'T0-S(0)-2=S(15).Detection Pval'], dtype='object')