jorahn / icy

data wrangling glue code
http://www.rcs-analytics.com/icy/index.html
MIT License

duplicate keys #2

Open jorahn opened 8 years ago

jorahn commented 8 years ago

make sure dictionary keys are not overwritten by duplicates

example:
path = folder with two differently named zip files (a.zip, b.zip),
each containing one csv file with the same filename (data.csv)
=> dict = {'data.csv': pd.DataFrame}, i.e. one of the two DataFrames is overwritten by the other

also keep keys as short and simple as possible!

maybe check whether the new key is already in the dict and enumerate equal keys (data.csv_0, data.csv_1, ...)
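
a minimal sketch of the enumeration idea, assuming the elements are available as (key, value) pairs before the dict is built. `enumerate_duplicate_keys` is a hypothetical helper, not part of icy, and strings stand in for the DataFrames. keys that occur only once stay short; keys that occur more than once get a running suffix. it does not guard against a suffixed key colliding with a key that already has that exact name.

```python
from collections import Counter


def enumerate_duplicate_keys(pairs):
    """Build a dict from (key, value) pairs without overwriting duplicates.

    Hypothetical helper for illustration only (not part of icy).
    """
    pairs = list(pairs)
    # First pass: count how often each key occurs.
    totals = Counter(key for key, _ in pairs)
    # Second pass: suffix only the keys that occur more than once.
    seen = Counter()
    result = {}
    for key, value in pairs:
        if totals[key] > 1:
            new_key = f"{key}_{seen[key]}"
            seen[key] += 1
        else:
            new_key = key  # unique keys stay as short as possible
        result[new_key] = value
    return result


# Placeholder strings stand in for the DataFrames read from each zip file.
pairs = [
    ("data.csv", "df from a.zip"),
    ("data.csv", "df from b.zip"),
    ("other.csv", "df from c.zip"),
]
result = enumerate_duplicate_keys(pairs)
print(result)
```

the drawback of this variant is that data.csv_0 no longer says which zip file it came from.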

jorahn commented 8 years ago

occurs in the babynames example. preliminary fix: disabled shortening the key to the name of the element when the path contains only one element. instead, the key is always sourcefile_element.

in the above example, this would result in
=> dict = {'a.zip_data.csv': df1, 'b.zip_data.csv': df2}
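
a rough sketch of the sourcefile_element naming, under the assumption that every zip file in the folder is opened and every element in it gets its own key. `read_zips` is a hypothetical name, not the actual icy function, and the element contents are kept as plain text here instead of being parsed into DataFrames.

```python
import os
import tempfile
import zipfile


def read_zips(folder):
    """Map 'sourcefile_element' keys to the text content of each zipped file.

    Hypothetical stand-in for illustration; icy itself may work differently.
    """
    result = {}
    for filename in sorted(os.listdir(folder)):
        if not filename.endswith(".zip"):
            continue
        with zipfile.ZipFile(os.path.join(folder, filename)) as archive:
            for element in archive.namelist():
                # Prefix the element name with its source file so that
                # equal element names in different archives do not collide.
                key = f"{filename}_{element}"
                result[key] = archive.read(element).decode("utf-8")
    return result


# Recreate the example: a.zip and b.zip, each containing data.csv.
with tempfile.TemporaryDirectory() as folder:
    for zip_name, content in [("a.zip", "x\n1\n"), ("b.zip", "x\n2\n")]:
        with zipfile.ZipFile(os.path.join(folder, zip_name), "w") as archive:
            archive.writestr("data.csv", content)
    loaded = read_zips(folder)

print(sorted(loaded))
```

the keys are longer than in the enumeration approach, but each one still shows where the data came from.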