ForestClaw / forestclaw

Quadtree/octree adaptive PDE solver based on p4est.
http://www.forestclaw.org
BSD 2-Clause "Simplified" License

Speed up transform #255

Closed scottaiton closed 1 year ago

scottaiton commented 1 year ago

I looked into the problem @jbsnively was having with ghost filling and found that a lot of time was being spent in the fclaw2d_patch_transform functions. The cause was that fclaw2d_clawpatch_get_options was being called for each ghost cell. Since we switched to storing options in the glob as key-value pairs, which makes each lookup more expensive, these per-cell calls caused the slowdown.

The solution was to store the clawpatch options pointer in transform_data->user, which avoids calling fclaw2d_clawpatch_get_options on each ghost cell.
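To illustrate the idea, here is a minimal, self-contained sketch of the caching pattern. It does not use the actual ForestClaw types or functions: `patch_options_t`, `glob_t`, `transform_data_t`, `get_options`, `transform_init`, and the index arithmetic are all hypothetical stand-ins, and `get_options` only counts calls rather than performing a real key-value lookup. The only detail taken from this PR is that the options pointer is looked up once and kept in a `user` field of the transform data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the ForestClaw types. */
typedef struct patch_options { int mx, my, mbc; } patch_options_t;
typedef struct glob { patch_options_t *opts; int lookup_count; } glob_t;
typedef struct transform_data { glob_t *glob; void *user; } transform_data_t;

/* Stands in for an expensive key-value lookup; it just counts its calls. */
static patch_options_t *get_options(glob_t *glob)
{
    glob->lookup_count++;
    return glob->opts;
}

/* Slow variant: one options lookup for every ghost cell. */
static int transform_cell_slow(transform_data_t *td, int i)
{
    const patch_options_t *opt = get_options(td->glob);
    return opt->mx - 1 - i;   /* placeholder index transform */
}

/* Look the options up once and cache the pointer in td->user. */
static void transform_init(transform_data_t *td, glob_t *glob)
{
    td->glob = glob;
    td->user = get_options(glob);
}

/* Fast variant: reuse the cached pointer, no lookup per cell. */
static int transform_cell_fast(const transform_data_t *td, int i)
{
    const patch_options_t *opt = (const patch_options_t *) td->user;
    assert(opt != NULL);      /* transform_init must have been called */
    return opt->mx - 1 - i;   /* same placeholder index transform */
}

/* Returns how many lookups were needed to transform ncells ghost cells. */
int count_lookups(int ncells, int use_cache)
{
    patch_options_t opts = { 8, 8, 2 };
    glob_t glob = { &opts, 0 };
    transform_data_t td = { &glob, NULL };
    long sum = 0;
    int i;

    if (use_cache)
        transform_init(&td, &glob);
    for (i = 0; i < ncells; i++)
        sum += use_cache ? transform_cell_fast(&td, i)
                         : transform_cell_slow(&td, i);
    (void) sum;               /* result unused; only the lookups matter here */
    return glob.lookup_count;
}
```

In this sketch, the uncached path performs one lookup per ghost cell, while the cached path performs a single lookup regardless of the number of cells. The cached pointer is only valid as long as the options object outlives the transform data.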

The original ghost filling timing, before we stored options in a key-value pair, was:

[libsc] Statistics for   GHOSTFILL_COPY
[libsc]    Global number of values:       8
[libsc]    Mean value (std. dev.):           17.4065 (2 = 11.5%)
[libsc]    Minimum attained at rank       0: 14.2407
[libsc]    Maximum attained at rank       2: 20.1298

With the change to storing the options in a key-value pair, the ghost filling time increases significantly:

[libsc] Statistics for   GHOSTFILL_COPY
[libsc]    Global number of values:       8
[libsc]    Mean value (std. dev.):           38.6166 (4.51 = 11.7%)
[libsc]    Minimum attained at rank       7: 31.345
[libsc]    Maximum attained at rank       2: 44.3934

With the new solution of storing the options pointer in transform_data->user:

[libsc] Statistics for   GHOSTFILL_COPY
[libsc]    Global number of values:       8
[libsc]    Mean value (std. dev.):           11.7372 (1.38 = 11.8%)
[libsc]    Minimum attained at rank       0: 9.73282
[libsc]    Maximum attained at rank       2: 13.748

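So this is even faster than the original code: the mean drops from 17.41 to 11.74, about 33% lower, and it is roughly 3.3 times faster than the key-value version.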

cburstedde commented 1 year ago

This is a really good catch!

donnaaboise commented 1 year ago

Fantastic!! Thanks, Scott.