Closed RahimKh closed 3 years ago
Hello, @RahimKh !
Thank you for your remarks! Since we are still working on the methodology for algorithm evaluation, your comment is helpful.
We see three possible ways of model fitting (fault-free train set selection):
We have selected the 3rd way for now, using the first 400 points of each dataset (approximately 1/3 of the total number of points) as a training set. This is not entirely fair (it reduces the number of unknown points, making the problem easier to solve), but it is still acceptable for a changepoint detection problem. As for the outlier detection problem: although you are right in general that "results need to be done on only data that is unknown to the model" when calculating metrics (FAR, MAR, F1), it can still be an option; in that case the results are just slightly overstated.
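To make the setup above concrete, here is a minimal sketch of splitting off the first 400 points as a training set and evaluating only on the remaining, unseen points. The function names and the exact definitions of FAR (false alarm rate) and MAR (missed alarm rate) are my assumptions for illustration, not taken from the repository's code:

```python
import numpy as np

def split_train_test(values, n_train=400):
    # First n_train points (assumed fault-free) are used for fitting;
    # the remainder is held out for metric calculation.
    return values[:n_train], values[n_train:]

def far_mar_f1(y_true, y_pred):
    # Binary anomaly labels: 1 = anomaly, 0 = normal.
    # Assumed definitions: FAR = FP / (FP + TN), MAR = FN / (FN + TP).
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    far = fp / (fp + tn) if (fp + tn) else 0.0
    mar = fn / (fn + tp) if (fn + tp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return far, mar, f1
```

Computing the metrics only on the held-out slice (rather than the whole file) avoids the overstatement mentioned above.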
We definitely want to switch to the 1st way of model fitting. We will probably switch to the 2nd way while a proper separate fault-free dataset remains unavailable.
The answer has been moved to the project slides.
Hello,
I have some concerns about the provided notebooks. Why is training done on files that contain anomalies instead of on the anomaly-free CSV file, with testing on the other files? Also, the results are computed over the whole file, including the training samples, but as I understand it, results should be reported only on data that is unknown to the model. Am I missing something? Thank you.