intellistream / Sesame

[SIGMOD 2023] Data Stream Clustering: An In-depth Empirical Study
MIT License

CluStream out_of_range with EDS input #154

Closed wzru closed 2 years ago

wzru commented 2 years ago
TEST(SystemTest, CluStream) {
  // Setup Logs.
  setupLogging("benchmark.log", LOG_DEBUG);
  // [529, 999, 1270, 1624, 2001, 2435, 2648, 3000]
  // [3, 3, 4, 6, 6, 7, 9, 9]
  // Parse parameters.
  param_t cmd_params;
  cmd_params.num_points = 100000;
  cmd_params.dim = 2;
  cmd_params.num_clusters = 90;
  cmd_params.num_last_arr = 2;
  cmd_params.time_window = 200;
  cmd_params.time_interval = 100;
  cmd_params.num_online_clusters = 1000;
  cmd_params.radius = 10;
  cmd_params.buf_size = 1500;
  cmd_params.offline_time_window = 0;
  cmd_params.seed = 10;
  cmd_params.time_decay = false;
  cmd_params.input_file = std::filesystem::current_path().generic_string() +
                          "/datasets/EDS.txt";

output:

Algorithm: CluStream num_last_arr: 2   time_window: 200   num_offline_clusters: 90   ClusterNumber: 1000   radius: 10   buf_size: 1500
DataSource spawn thread=0
Engine spawn thread=1
DataSink spawn thread=2
DataSink start to grab data
Algorithm start to process data
DataSource start to emit data
KMeans++ start!!!
KMeans++ sourceEnd!!!
sourceEnd set to true
ready to process remaining data
ready to offline clustering
KMeans++ start!!!
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 27) >= this->size() (which is 27)
Aborted
wzru commented 2 years ago

num_online_clusters should be greater than num_clusters, and buf_size should be greater than num_online_clusters.
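
The two constraints above could be checked up front, before the algorithm starts. The following is a minimal sketch, not code from Sesame: `ClusterSizes` and `CheckCluStreamSizes` are hypothetical names, and the struct mirrors only the three `param_t` fields mentioned in this thread.

```cpp
#include <string>

// Hypothetical stand-in for the three param_t fields discussed in this issue.
// The real param_t has more members; only these three are modeled here.
struct ClusterSizes {
  int num_clusters;         // printed as num_offline_clusters in the log
  int num_online_clusters;  // printed as ClusterNumber in the log
  int buf_size;
};

// Returns an empty string when both constraints from the comment hold,
// otherwise a message describing the first violated constraint.
inline std::string CheckCluStreamSizes(const ClusterSizes &p) {
  if (p.num_online_clusters <= p.num_clusters) {
    return "num_online_clusters (" + std::to_string(p.num_online_clusters) +
           ") should be greater than num_clusters (" +
           std::to_string(p.num_clusters) + ")";
  }
  if (p.buf_size <= p.num_online_clusters) {
    return "buf_size (" + std::to_string(p.buf_size) +
           ") should be greater than num_online_clusters (" +
           std::to_string(p.num_online_clusters) + ")";
  }
  return "";
}
```

A caller might run `CheckCluStreamSizes({cmd_params.num_clusters, cmd_params.num_online_clusters, cmd_params.buf_size})` right after parsing parameters and stop with the returned message if it is non-empty, so that a bad configuration is reported before any clustering begins.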