seermedical / seer-py

Python SDK for the Seer data platform
MIT License

data format issues #143

Closed kmakaram closed 3 years ago

kmakaram commented 3 years ago

The data provided by "My Seizure Gauge Data" does not appear to be in the correct format. For example, for 'MSEL_01575' I checked the heart rate signal using code based on "Example.ipynb", and the resulting values were on the order of 1.0e+11.

Secondly, "msg_data_downloader.py" produces multiple parquet files. Example code for processing these files would be helpful.

EwanNurse commented 3 years ago

Hi kmakaram, thanks very much for getting in touch.

As you can tell, there's been an issue with the scaling of the data values. We're aware of it and are looking for a solution.

We have split the data into multiple parquet files, as this was felt to be most useful for users managing longer recordings. We hope the files have sufficient timing information for you to combine them yourself, but we will certainly add this as an option in future releases.
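In the meantime, combining the files could look roughly like the sketch below. It is only an illustration, not part of seer-py: it assumes pandas with a parquet engine such as pyarrow is installed, that all downloaded files for one signal sit in a single folder, and that each file has a timestamp column. The column name `time` and both helper functions are hypothetical, so adjust them to match what the downloaded files actually contain.

```python
from pathlib import Path

import pandas as pd


def combine_frames(frames, time_column="time"):
    """Concatenate per-file DataFrames and sort them chronologically.

    `time_column` is an assumed column name, not something confirmed
    by the downloader's output.
    """
    combined = pd.concat(frames, ignore_index=True)
    # A stable sort keeps the original order of rows sharing a timestamp.
    combined = combined.sort_values(time_column, kind="stable")
    return combined.reset_index(drop=True)


def load_parquet_folder(folder, time_column="time"):
    """Read every .parquet file in `folder` into one combined DataFrame."""
    paths = sorted(Path(folder).glob("*.parquet"))
    if not paths:
        raise FileNotFoundError(f"no parquet files found in {folder}")
    frames = [pd.read_parquet(path) for path in paths]
    return combine_frames(frames, time_column)
```

Calling `load_parquet_folder` on the download folder for one signal would then give a single time-ordered table. Sorting on the timestamp rather than on the file names avoids relying on any particular naming scheme.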

kmakaram commented 3 years ago

Does a linear mapping of these values to the signal range (as given in the metadata) lead to good results?

dkeden commented 3 years ago

Hi @kmakaram,

If you are referring to HR signal values, you should be able to just divide by the exponent (as provided in the metadata). Unfortunately some data was uploaded with an exponent of 10^9 by mistake.
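As a rough sketch of that division, under stated assumptions: the scale factor is passed in as the full value (for example 10^9), whereas how the metadata actually stores it is not shown here and should be checked. The `rescale` helper and the sample values are made up for illustration.

```python
def rescale(raw_values, scale_factor):
    """Divide each raw sample by the scale factor taken from the metadata."""
    if scale_factor == 0:
        raise ValueError("scale_factor must be non-zero")
    return [value / scale_factor for value in raw_values]


# Made-up raw samples of roughly the magnitude reported above (~1e11).
raw_heart_rate = [7.2e10, 8.0e10, 1.1e11]

# Assumption: the factor is available as the full value 10^9.
heart_rate = rescale(raw_heart_rate, 1e9)
```

If the metadata instead holds only the power (for example 9), the factor would be `10 ** 9` before dividing.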

Let me know how you go, and if you have any further issues.