usc-isi-i2 / dig-etl-engine

Download DIG to run on your laptop or server.
http://usc-isi-i2.github.io/dig/
MIT License

Timeseries annotation: Can't fetch data from one sheet when xlsx file contains multiple sheets #185

Closed SukritiSharma closed 6 years ago

SukritiSharma commented 6 years ago

Currently, to extract time series data with extractSpreedsheet.py, we need to specify annotations for all sheets when a single xlsx file contains multiple sheets. I couldn't fetch data from my targeted sheet because it tries to apply my annotation to all the sheets.

We may need a feature that lets us specify a sheet name or sheet index to extract data from when the file has multiple sheets.

puuj commented 6 years ago

Please clarify. If you only annotate sheet 1, the others will be ignored (this works in the ELICIT data sources). If you want to skip sheet 1 and only annotate sheet 2, I think I can make it ignore a blank annotation for sheet 1 and have it work.

SukritiSharma commented 6 years ago

The problem I faced with my data set was that I had 21 sheets inside my xlsx file, and the data relevant to me was only in sheet number 10. If I left the first 9 annotations blank ({}), it gave me a key error.

So I think the 2nd case mentioned by @puuj would be relevant to us. @saggu can elaborate more on how we may want that feature in the long run.
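To illustrate the failure described above, here is a minimal sketch in plain Python. It does not use the actual dig-etl-engine code: the annotation layout, the `data_range` key, and both function names are assumptions made only for this example. It shows how reading a required key from a blank `{}` annotation raises a key error, and how skipping blank annotations would avoid it.

```python
# Hypothetical sketch: the annotation layout and the "data_range" key are
# assumptions for illustration, not the real dig-etl-engine format.

def extract_naive(sheet_names, annotations):
    """Read a required key from every annotation, paired with sheets by position."""
    # A blank {} annotation has no "data_range" key, so this raises KeyError.
    return {name: ann["data_range"] for name, ann in zip(sheet_names, annotations)}


def extract_skipping_blanks(sheet_names, annotations):
    """Same pairing, but blank annotations are treated as 'skip this sheet'."""
    return {
        name: ann["data_range"]
        for name, ann in zip(sheet_names, annotations)
        if ann  # an empty dict is falsy, so the sheet is ignored
    }


# 21 sheets, with only sheet number 10 annotated (nine blanks come first).
sheet_names = [f"Sheet{i}" for i in range(1, 22)]
annotations = [{} for _ in range(9)] + [{"data_range": "B2:M40"}]

try:
    extract_naive(sheet_names, annotations)
    naive_failed = False
except KeyError:
    naive_failed = True

selected = extract_skipping_blanks(sheet_names, annotations)
```

With this setup the naive version fails on the first blank annotation, while the tolerant version returns an entry for the tenth sheet only.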

puuj commented 6 years ago

The new version allows you to specify sheet_indices for each annotation.
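As a rough sketch of how a per-annotation `sheet_indices` setting could select sheets: the thread confirms only the `sheet_indices` name. Everything else here is an assumption for illustration, including the dictionary layout, zero-based indexing, the `data_range` key, and the helper `sheets_for_annotation`.

```python
# Hypothetical sketch: only the name "sheet_indices" comes from the thread.
# Zero-based indexing and the dictionary layout are assumptions.

def sheets_for_annotation(annotation, sheet_names):
    """Return the names of the sheets that an annotation should be applied to."""
    indices = annotation.get("sheet_indices")
    if indices is None:
        # No restriction given: apply the annotation to every sheet.
        return list(sheet_names)
    # Keep only indices that actually exist in the workbook.
    return [sheet_names[i] for i in indices if 0 <= i < len(sheet_names)]


sheet_names = [f"Sheet{i}" for i in range(1, 22)]

# Target only the tenth sheet (index 9 under the zero-based assumption).
annotation = {"sheet_indices": [9], "data_range": "B2:M40"}
targets = sheets_for_annotation(annotation, sheet_names)
```

Under these assumptions a single annotation can target the tenth sheet without blank placeholder annotations for the nine sheets before it.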