Open murometz opened 6 years ago
Thank you. That is true. The dataset format is based on the Multivariate Time Series Classification Datasets.
The first column is the sample id, the second column is the time stamp of the observation, the third column is the label for the sample (it may not change for the same id), and the remaining columns are the observations (the different dimensions of the multivariate time series).
An example:
Sample Id | Time Stamp | Class | Pressure | Temperature | Energy |
---|---|---|---|---|---|
1 | 1 | 1 | 2.70 | 80.50 | 4.50 |
1 | 2 | 1 | 3.20 | 78.40 | 6.70 |
1 | 3 | 1 | 4.20 | 67.90 | 3.40 |
1 | 4 | 1 | 8.20 | 89.50 | 7.20 |
1 | 5 | 1 | 8.90 | 85.70 | 5.70 |
2 | 1 | 3 | 16.34 | 97.54 | 5.02 |
2 | 2 | 3 | 17.61 | 99.66 | 5.01 |
2 | 3 | 3 | 18.87 | 101.60 | 4.90 |
2 | 4 | 3 | 20.14 | 103.54 | 4.95 |
2 | 5 | 3 | 22.67 | 107.43 | 4.95 |
2 | 6 | 3 | 21.15 | 106.50 | 4.97 |
.. | .. | .. | .. | .. | .. |
N | 1 | 0 | 8.90 | 85.70 | 5.70 |
N | 2 | 0 | 10.01 | 88.00 | 5.05 |
N | 3 | 0 | 11.28 | 89.94 | 5.04 |
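To make the layout concrete, here is a minimal Python sketch that groups rows of this form into one multivariate series per sample id and checks that the label stays constant within a sample. The function name `group_samples` and the in-memory tuple representation are assumptions made for illustration; they are not part of MUSE or its loader.

```python
from collections import defaultdict

# A few rows copied from the example table:
# (sample_id, time_stamp, label, pressure, temperature, energy)
rows = [
    (1, 1, 1, 2.70, 80.50, 4.50),
    (1, 2, 1, 3.20, 78.40, 6.70),
    (1, 3, 1, 4.20, 67.90, 3.40),
    (2, 1, 3, 16.34, 97.54, 5.02),
    (2, 2, 3, 17.61, 99.66, 5.01),
]


def group_samples(rows):
    """Return {sample_id: (label, [observation vectors in time order])}.

    Hypothetical helper: raises ValueError if a sample id carries
    more than one label.
    """
    labels = {}
    series = defaultdict(list)
    for sample_id, time_stamp, label, *obs in rows:
        # The label must be the same for every row of one sample id.
        if labels.setdefault(sample_id, label) != label:
            raise ValueError(f"sample {sample_id} has more than one label")
        series[sample_id].append((time_stamp, obs))
    return {
        sid: (labels[sid], [obs for _, obs in sorted(points, key=lambda p: p[0])])
        for sid, points in series.items()
    }


samples = group_samples(rows)
label, observations = samples[2]
print(label, len(observations), observations[0])
```

Each sample ends up as a label plus a list of observation vectors, one vector per time stamp and one entry per dimension.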
Hi Patrick
Thank you very much for the fast reply, it helps a lot!
The time series index is just a temporal order of events?
Timeseries id - e.g. different sensors, right?
Thanks again. Best regards Ilja
Hi Ilja,
Yes, the time index (aka time stamp) is the temporal order of the events.
No, the time series id is the sample id. Sample 1 could be Berlin, sample 2 could be Paris, and sample N could be London. Each one has 3 sensors, for temperature, pressure and energy.
So there is no explicit sensor id. It is implicitly encoded in the last columns.
Hi Patrick
Great, thanks a lot.
Regards Ilja
Hi Patrick
> the third column is the label for the sample (it may not change for the same id)
This means that I can't have different classes for one sample ID?
If I want to detect different activities with several sensor sets installed at different locations, I would have different classes per location (each with its own sample ID).
I have one sample as of now, but it contains different classes.
How should the data be structured in this case?
Thank you very much for your time!
Best regards Ilja
I am not sure what you mean by "I have one sample as of now, but it contains different classes." Do you mean one person performing different activities?
A sample can be thought of as a single recorded activity, and the sample id is similar to a primary key in a database. So for example: Sample 1: Person A jumps. Sample 2: Person A sits. Sample 3: Person A eats. Sample 4: Person A walks.
Here we have a single person doing multiple activities. Each sample can then have multiple sensors attached to it, such as wrist, finger, arm, etc.
But we could as well have different persons (A,B,C) doing different activities:
Sample 1: Person A jumps. Sample 2: Person A sits. Sample 3: Person A eats. Sample 4: Person A walks. Sample 5: Person B jumps. Sample 6: Person C jumps.
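If a recording is continuous and labeled per time step, one possible way to arrive at this one-activity-per-sample layout is to cut it wherever the label changes. The sketch below does this in Python. It is only an illustration under assumptions: the function `split_into_samples`, the string labels, and the two-dimensional observations are made up here, and nothing in this thread confirms that MUSE handles data prepared this way well.

```python
from itertools import groupby


def split_into_samples(recording):
    """Split a continuous recording into one sample per uninterrupted activity.

    `recording` is a list of (label, observations) pairs in temporal order.
    Returns rows laid out as (sample_id, time_stamp, label, *observations).
    Hypothetical helper, not part of MUSE.
    """
    rows = []
    # groupby yields one run per stretch of consecutive equal labels.
    runs = groupby(recording, key=lambda step: step[0])
    for sample_id, (label, run) in enumerate(runs, start=1):
        # Time stamps restart at 1 inside every new sample.
        for time_stamp, (_, obs) in enumerate(run, start=1):
            rows.append((sample_id, time_stamp, label, *obs))
    return rows


# Made-up recording: two sensor dimensions, labels assigned per time step.
recording = [
    ("sit", [0.1, 0.2]),
    ("sit", [0.1, 0.3]),
    ("walk", [0.9, 0.8]),
    ("walk", [1.0, 0.7]),
    ("walk", [0.9, 0.9]),
    ("sit", [0.2, 0.2]),
]

rows = split_into_samples(recording)
for row in rows:
    print(row)
```

With this input the second "sit" stretch becomes its own sample (id 3), so every sample id carries exactly one label.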
Hi Patrick
Thank you. I have a set of sensors installed in one apartment. This sensor set records different activities, as you mentioned. I already have these activity classes assigned to the individual time steps (from a protocol) and want to train a model to detect these activities from the sensor data alone. I would also like to know whether the person is doing something that can be regarded as an anomaly. The entire recording is not separated into samples.
I see. This sounds like a multi-label classification problem?
Unfortunately, MUSE does not support this kind of application yet. Are you able to share this data in some way? I would be interested in looking into it, though I cannot guarantee how quickly I will be able to do so.
Hello
Thank you for your fascinating work! What structure do the datasets in the datasets folder have? They don't look like the original UCI datasets... What are the columns?
For example, DigitShapeRandom:
1 1 1 0.3421972205305417 -1.594004942648406
1 2 1 0.3490627644881473 -1.4250704156116172
1 3 1 0.3353316765729366 -1.2647991976536381
1 4 1 0.3559283084457524 -1.0915330160774444
1 5 1 0.3559283084457524 -0.9225984890406556
1 6 1 0.35249553646694987 -0.7709905801614865
1 7 1 0.3490627644881473 -0.5847294349670782
1 8 1 0.37309216833976566 -0.4634431078637429
1 9 1 0.37309216833976566 -0.35515174437862146
First column seems to be the class. And the rest?
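Read with the column layout given in the reply at the top of this thread (sample id, time stamp, class, then the observations), the excerpt would parse as in the following Python sketch. It assumes whitespace-separated fields and integer ids and labels, which matches the lines quoted here but has not been checked against the full files; `parse_line` is a hypothetical helper.

```python
# First three lines of the DigitShapeRandom excerpt quoted above.
excerpt = """\
1 1 1 0.3421972205305417 -1.594004942648406
1 2 1 0.3490627644881473 -1.4250704156116172
1 3 1 0.3353316765729366 -1.2647991976536381
"""


def parse_line(line):
    """Split one line into (sample_id, time_stamp, label, observations)."""
    fields = line.split()
    # Assumption: the first three fields are integers.
    sample_id, time_stamp, label = (int(v) for v in fields[:3])
    # All remaining fields are the dimensions of the time series.
    observations = [float(v) for v in fields[3:]]
    return sample_id, time_stamp, label, observations


parsed = [parse_line(line) for line in excerpt.splitlines() if line.strip()]
for sample_id, time_stamp, label, observations in parsed:
    print(f"sample={sample_id} t={time_stamp} class={label} dims={observations}")
```

Under that reading the first column is the sample id rather than the class, the class is the third column, and this dataset has two observation dimensions per time stamp.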
Thank you very much for your time.
Regards Ilja