jorahn / icy

data wrangling glue code
http://www.rcs-analytics.com/icy/index.html
MIT License

custom csv format #3

Open jorahn opened 9 years ago

jorahn commented 9 years ago

Add an interface for reading custom, manually compressed CSVs.

Example: see https://www.kaggle.com/c/icdm-2015-drawbridge-cross-device-connections/data. Read CSVs like

id,{(key1,key2)}
1,{(a,4),(b,5)}

into

id,key1,key2
1,a,4
1,b,5

using something like

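import re
import pandas as pd
columns = re.sub(r'[{}()\n]', '', file.readline()).split(',')  # header -> id,key1,key2
rows = []
for line in file:  # file: an already opened text file handle
  const, rest = line.split('{', 1)  # const keeps its trailing comma, e.g. '1,'
  for group in re.findall(r'\((.*?)\)', rest):  # one match per (..) group
    rows.append((const + group).split(','))
df = pd.DataFrame(rows, columns=columns)  # build once instead of appending row by row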

interface idea: expand_on='{()}' argument
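
A minimal sketch of how such an `expand_on` argument might behave, assuming its four characters mean set open, group open, group close, set close. The function name `expand_rows` and this reading of `expand_on` are hypothetical; nothing like it exists in icy yet. The parsing is kept free of pandas so it can be tried on its own.

```python
import io
import re


def expand_rows(source, expand_on="{()}"):
    """Hypothetical helper: return (columns, rows) with one row per group.

    Assumes expand_on holds four characters: set open, group open,
    group close, set close. All values stay strings.
    """
    if len(expand_on) != 4:
        raise ValueError("expand_on needs exactly four characters, e.g. '{()}'")
    set_open, group_open, group_close, _set_close = expand_on
    # Strip the delimiters and line endings from the header line.
    strip_re = re.compile("[" + re.escape(expand_on) + r"\r\n]")
    # Non-greedy match of the text between one group open/close pair.
    group_re = re.compile(re.escape(group_open) + "(.*?)" + re.escape(group_close))

    columns = strip_re.sub("", source.readline()).split(",")
    rows = []
    for line in source:
        if not line.strip():
            continue  # skip blank lines
        # const keeps its trailing comma, e.g. "1,"
        const, rest = line.split(set_open, 1)
        for group in group_re.findall(rest):
            rows.append((const + group).split(","))
    return columns, rows


sample = io.StringIO("id,{(key1,key2)}\n1,{(a,4),(b,5)}\n")
columns, rows = expand_rows(sample)
```

The result can then be handed to `pd.DataFrame(rows, columns=columns)`. The sketch assumes the values themselves contain no commas or delimiter characters, and it does no dtype inference.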