anthony-walker / mdpi-swept-paper-2021


Reviewer 1 comments #1

Open anthony-walker opened 3 years ago

anthony-walker commented 3 years ago

This paper presents the implementation and testing of a 2D CPU-GPU solver for PDEs using the swept rule. The test results indicate that a speedup (>1) can only be achieved by careful selection of configuration parameters for different equations, scales, and hardware. The logic is very clear, but I have some questions as well as some suggestions:

Questions:

  • [x] Q1: In line 168, the paper mentions “largest block size”. Does it mean the sizes of blocks are different for a particular case? As far as I know, when a kernel is launched in CUDA, all blocks have the same size.
  • [x] Q2: In line 184, where do the “32 processes” come from? Could you list more information about the two sets of hardware, e.g., the number of physical CPUs and GPU cards on each node?
  • [x] Q3: In Section 3.3, Swept Solution Process, Figures 1 and 2 give an overview of each sub-step, but they are not clear enough for readers who are unfamiliar with the swept technique. Could you please add more explanation (pseudo-code, a diagram, or a sketch of a simple example with a depth of 3 time steps), especially for X-Bridge and Y-Bridge? Ideally, show the input/output of each process. This would make the paper more self-contained and understandable for more readers.
  • [x] Q4: In line 290, could you explain further why “writing to disk” is necessary within each calculation step? Because hard disks are very slow compared to memory, they are usually used only during the initialization or post-processing stages.
  • [ ] Q5: In Figures 4, 5, 7, and 8, there is no clear trend in how share or block size affects the best/worst performance. Could you explain why?
  • [x] Q6: In line 435, could you list the situations in which the GPUs cannot solely be used?

Suggestions:

anthony-walker commented 3 years ago

Make subfigures and show the standard process first. Add more detail to the captions about shapes, communication, etc. Add a 1D figure to help explain the triangle and communication.
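
For the 1D explanation, a minimal sketch of the first triangle could look like the one below. It is only an illustration, not the paper's implementation: it assumes a three-point stencil (an explicit heat-equation update), a single block of 8 points, and the hypothetical names `up_triangle` and `alpha`.

```python
def up_triangle(u0, alpha=0.25):
    """Advance one block as far as possible without any neighbor data.

    Returns a list of rows; row k is (start_index, values) at time level k.
    The function name and alpha are illustrative assumptions.
    """
    start, vals = 0, list(u0)
    rows = [(start, vals)]
    while len(vals) > 2:
        # Three-point stencil: every new value needs its left and right
        # neighbor, so the valid span loses one point on each side per step.
        vals = [
            vals[i] + alpha * (vals[i - 1] - 2.0 * vals[i] + vals[i + 1])
            for i in range(1, len(vals) - 1)
        ]
        start += 1
        rows.append((start, vals))
    return rows


block = [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]
for level, (start, vals) in enumerate(up_triangle(block)):
    # Indent each row by its start index so the triangle shape is visible.
    print(level, "      " * start + " ".join(f"{v:5.2f}" for v in vals))
```

With 8 points the block advances 3 time steps on its own (row widths 8, 6, 4, 2), which matches the depth-3 example requested in Q3. The end values of each row are the ones a neighboring block would need to fill the gap between triangles, so a figure built from this picture could mark them as the communicated data.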