fmihpc / vlasiator

Vlasiator - ten letters you can count on
https://www.helsinki.fi/en/researchgroups/vlasiator
Other
47 stars 38 forks source link

Separating distribution populations #16

Open ohannuks opened 10 years ago

ohannuks commented 10 years ago

There was a branch for separating populations from each other but the algorithm was too slow so it was forgotten. I'm reopening the issue and I'd suggest using thrust library (or other) for implementing parallel sorting for it: https://github.com/thrust/thrust http://gcc.gnu.org/onlinedocs/libstdc++/manual/parallel_mode.html

ohannuks commented 10 years ago

First task before starting to work on this is to evaluate how fast the algorithm would be by implementing the heaviest operations that will be used (mainly parallel sort, which is of complexity O(nlogn/t), where n = number of blocks/velocity cells and t is the number of threads.

ohannuks commented 10 years ago

Another idea is to use parallelization over spatial cells (and before doing that calculate roughly equal amount of blocks for every thread). Anyhow, I'll change the algorithm from the previous one.

galfthan commented 10 years ago

Could you first of all list the algorithms here, with references. We have had in mind a number of algorithms. Also, I think that adding the parallel sort is something that makes sense to do only once the algorithm works. I am not too keen either to add yet another dependency...

ohannuks commented 10 years ago

Pseudocode (roughly)

// Clustering algorithm based on volume
clustering_volume
   // Neighbors(v) fetches all velocity neighbors, complexity O(m) = O(1), where m=26 is number of neighbors
   // Note: According to initial tests with 50x50x50 velocity cells neighbors(v) is by far the heaviest operation in  this algorithm
   // sorted velocity_cells : complexity O(nlogn)
   velocity_cells = [4.4e-10, 4e-10, .. , 2e-16, 1e-16] // avgs values
   //for loop, complexity O(n * m) = O(n)
   for v in velocity_cells:
      // Neighbors(v) fetches all velocity neighbors of v
      if any neighbors(v) in a cluster:
     if one neighbor neighbor_1:
        add v to neighbor (neighbor_1's cluster)
     else if more than one neighbors neighbor_1 and neighbor_2:
        if (neighbor_1's cluster) or (neighbor_2's cluster) is small enough:
           merge (neighbor_1's cluster) and (neighbor_2's cluster)
        add v to neighbor_1's cluster
      else:
     // v does not have neighbors that are a part of a cluster
     v = new cluster

// Clustering algorithm based on threshold
clustering_threshold
   // Neighbors(v) fetches all velocity neighbors
   // Unsorted velocity_cells:
   velocity_cells = [4.4e-10, 2e-16, .. , 4e-10, 1e-16] // avgs values
   v = max(velocity_cells)
   remove v from velocity_cells
   threshold = 1e-16
   // Vector for holding the unprocessed velocity cells at the cluster border, for example if the cluster looked spherical with radius R then the velocity cells at the cluster border would be the velocity cells sitting at the radius R, furthest away from the center of the cluster. This is just so we know which velocity cells to process next
   unprocessed_velocity_cells = [ v ]
   // For loop, complexity O(n * m + n * c) = O(n), where m = 26 is the number of neighbors for every velocity cell and c is the number of clusters
   while true:
      if size( unprocessed_velocity_cells ) == 0:
     // max complexity: O(n)
     if max( velocity_cells ) <= threshold:
        // The algorithm is all done
        break
     else:
        // Create new cluster
        new cluster
        unprocessed_velocity_cells = max( velocity_cells )
        get v from unprocessed_velocity_cells
      else:
     get v from unprocessed_velocity_cells

      // O(m) = O(1) complexity
      for neighbor in neighbors(v):
     // Check if neighbor passes the threshold condition and that it is not yet processed O(1)
     if neighbor > threshold and neighbor in velocity_cells:
        add neighbor to latest cluster
        add neighbor to unprocessed_velocity_cells
        remove neighbor from velocity_cells
      remove v from unprocessed_velocity_cells