Closed by victordaniel 4 years ago
For Twitter datasets, we only have a few ground-truth events, so we cannot directly assign a 0/1 label to every tweet. However, when anomaly detection algorithms are run on such datasets, we can observe that the peaks (with tweets aggregated per hour/day/week) correspond well to the ground-truth events. If you still need to generate labels, you can try to label tweets containing goal/penalty/injury kind of events as anomalous (label = 1).
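A minimal sketch of that keyword-based labeling, assuming you have the raw tweet text available. The keyword list, the `label_tweet` helper, and the sample tweets are all hypothetical and only illustrate the idea; this is not the labeling used by the MIDAS authors.

```python
import re

# Hypothetical keyword list taken from the event kinds mentioned above;
# extend it with whatever event types matter for your evaluation.
EVENT_KEYWORDS = ("goal", "penalty", "injury")

# Match whole words only, case-insensitively, so "goalkeeper" is not a hit.
_EVENT_PATTERN = re.compile(
    r"\b(" + "|".join(EVENT_KEYWORDS) + r")\b", re.IGNORECASE
)


def label_tweet(text: str) -> int:
    """Return 1 (anomalous) if the text mentions an event keyword, else 0."""
    return 1 if _EVENT_PATTERN.search(text) else 0


# Made-up example tweets, for illustration only.
tweets = [
    "What a GOAL by Neymar!",
    "Watching the match with friends",
    "Penalty for Brazil",
]
labels = [label_tweet(t) for t in tweets]
```

Whole-word matching is a deliberate choice here: it avoids false hits such as "goalkeeper", but it also misses plurals like "goals", so you may want to add those variants to the list.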
Thanks!
"label tweets containing goal/penalty/injury kind of events as anomalous (label = 1)"
Does this mean that tweets containing events should be considered anomalous, and tweets that do not contain events non-anomalous?
Can you suggest any other Twitter or Facebook dataset for edge stream anomaly detection?
Yes, I meant that. Sorry, I am not aware of any other datasets from Twitter/Facebook.
I want to run MIDAS on the TwitterWorldCup2014 dataset, but in the given dataset the ground truth does not include a 0 or 1 label. Instead, it shows entries like the following:
1 | Arena de Sao Paulo, Sao Paulo, Brazil | Brazil, Croatia | Marcelo | Own Goal | 6-12-2014 20:11:00 | High importance events.
Please suggest how to generate 0/1 labels, i.e., anomalous or not. Have you already prepared ground-truth labels for this? If so, could you please share them?
Here in this dataset, there are three events such as
Thanks.
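If it helps, here is a rough sketch of how a ground-truth row in the pipe-separated format quoted above could be turned into time-based 0/1 labels. It rests on several assumptions that are not confirmed in this thread: that the sixth field is the event time, that the date is written month-day-year, and that a fixed window around each event is a reasonable definition of "anomalous". The helpers `parse_event_time` and `label_timestamp` and the one-hour default window are hypothetical.

```python
from datetime import datetime, timedelta

# The ground-truth row quoted in the question above.
ROW = ("1 | Arena de Sao Paulo, Sao Paulo, Brazil | Brazil, Croatia | "
       "Marcelo | Own Goal | 6-12-2014 20:11:00 | High importance events.")


def parse_event_time(row: str) -> datetime:
    """Extract the event timestamp, assumed to be the 6th pipe-separated field."""
    fields = [field.strip() for field in row.split("|")]
    # Assumption: the date is month-day-year (6-12-2014 read as 12 June 2014).
    return datetime.strptime(fields[5], "%m-%d-%Y %H:%M:%S")


def label_timestamp(ts, event_times, window=timedelta(hours=1)) -> int:
    """Return 1 if ts falls within `window` of any event time, else 0."""
    return 1 if any(abs(ts - event) <= window for event in event_times) else 0


event_times = [parse_event_time(ROW)]
near_label = label_timestamp(datetime(2014, 6, 12, 20, 30), event_times)
far_label = label_timestamp(datetime(2014, 6, 13, 9, 0), event_times)
```

The window size directly controls how many edges end up labeled 1, so it is worth trying a few values and checking the result against the aggregation granularity (hour/day/week) you use for evaluation.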