tbenthompson / cutde

Python CPU and GPU accelerated TDEs, over 100 million TDEs per second!
MIT License
58 stars 15 forks source link

Halfspace matrix always returning Nan #25

Closed hvasbath closed 1 year ago

hvasbath commented 1 year ago

Dear Ben,

first of all thanks A LOT for putting together this great package and the invaluable BIE Tutorials! These are really helpful!!!

Now I intended to start assembling my own setup where I only wanted to stick to the HalfSpace solutions as topography is not an issue currently. However, I do not seem to manage to be able to get the halfspace solution to work. The disp_matrix function always just returns no matter what. I think I am simply making a stupid mistake somewhere.

I was using your example.py code to adjust it a little to use the halfspace solutions and put the triangles a little below the surface, but there seems to be something else not working.

Can you please also elaborate a little on the design decision why the number of triangles and observations in the "disp" functions need to be identical? In order to calculate the displacement at surface for 1000 points due to 10 triangles I would need to repeat my triangle indices 1000 times-right?

I appreciate any hint/suggestion very much! Cheers! Hannes

import matplotlib.pyplot as plt
import numpy as np

import cutde.halfspace as HS

from time import time

ft = "float32"

xs = np.linspace(-2, 2, 200)
ys = np.linspace(-2, 2, 200)
obsx, obsy = np.meshgrid(xs, ys)
pts = np.array([obsx, obsy, 0 * obsy]).reshape((3, -1)).T.copy().astype(ft)

fault_pts = np.array([[-1, 0, -1], [1, 0, -1], [1, 0, -2], [-1, 0, -2]]).astype(ft)
fault_tris = np.array([[0, 1, 2], [0, 2, 3]], dtype=np.int64)
print(fault_pts[fault_tris].shape)
t0 = time()
disp_mat = HS.disp_matrix(obs_pts=pts, tris=fault_pts[fault_tris], nu=0.25)

t1 = time()
print(t1 - t0)

slip = np.array([[1, 0, 0], [1, 0, 0]])
disp = disp_mat.reshape((-1, 6)).dot(slip.flatten())

disp_grid = disp.reshape((*obsx.shape, 3))

plt.figure(figsize=(5, 5), dpi=300)
cntf = plt.contourf(obsx, obsy, disp_grid[:, :, 0], levels=21)
plt.contour(
    obsx,
    obsy,
    disp_grid[:, :, 0],
    colors="k",
    linestyles="-",
    linewidths=0.5,
    levels=21,
)
plt.colorbar(cntf)
plt.title("$u_x$")
plt.tight_layout()
plt.savefig("example.png", bbox_inches="tight")
tbenthompson commented 1 year ago

Unfortunately, I think you've run across a bug with the half-space code. I'm pretty sad about this! But I guess the half-space code got a lot less testing than the full-space code!

I was able to solve the problem by reversing the order of the second triangle:

fault_tris = np.array([[0, 1, 2], [0, 3, 2]], dtype=np.int64)

That doesn't make me feel good and might suggest a place to start debugging.

On the other hand, I was able to cause much worse failures by adding a small number to the y-coord of point 4. In this situation, the results are no longer NaN and instead are large and incorrect.

fault_pts = np.array([[-1, 0, -1], [1, 0, -1], [1, 0, -2], [-1, 1e-3, -2]]).astype(ft)
fault_tris = np.array([[0, 1, 2], [0, 2, 3]], dtype=np.int64)

Are you interested in helping out debug this? If you want to use cutde.halfspace soon, that'll be faster than waiting for me to find the time to dig in.

My recommendation:

I'm going to also give you full write permissions on the repo in case you want to put up a PR and merge it yourself.

hvasbath commented 1 year ago

Thanks a lot for these directions! Thats very helpful! I will definitely try to debug it as soon as I find the time. Again this repo is great and bugs here and there can happen to anyone ;) .

tbenthompson commented 1 year ago

Hey! I got an email from this thread about printing a vector. I'm guessing you deleted the comment since it's not here anymore. Did you figure it out? A couple options:

std::cout << vec[0] << std::endl; 

will print the first element.

for (int i = 0; i < 5; i++) {
    std::cout << vec[i] << " ";
}
std::cout << std::endl;

will print the first five elements separated by spaces.

Feel free to ask more questions! You're welcome to email me or comment here. Whatever is best for you. I'm glad you're diving in and making progress. That's awesome!!!

hvasbath commented 1 year ago

Yes I figured it out ;) . Thanks! I also found the problem: here: https://github.com/tbenthompson/cutde/blob/master/cutde/common.cu#L429 and here: https://github.com/tbenthompson/cutde/blob/master/cutde/common.cu#L438

These two terms become Nan, due to division by zero coming from cosB being zero as the cos(beta) when beta = pi/2. This works in Matlab as matlab returns a very low value (6.1232e-17) instead of a proper zero, that is returned in your implementation.

I need your advice here- what a proper fix would be. Does increase in number precission help here? But on the GPU we are limited to float32 to be efficient, aren`t we?

I also contacted Mehdi, what his advice would be.

tbenthompson commented 1 year ago

Woowww!! Thanks for debugging. This is super great!

Questions:

If Matlab does return the correct answer for this situation, then I guess the mathematical expression is robust to that kind of division by a super small number? Then maybe the right thing to do is to copy Matlab by checking for a identical zero in beta and then setting cosB to a very small number (1e-15?). I'm not completely sure here and I would want some tests to rigorously check against Matlab

There's an existing automated test here that loops over a whole bunch of randomly generated triangles and points and confirms the results match Matlab. It would be nice to also add an additional test to that file that checks this beta=0 situation if you're up for that! But, I'm also happy to do that myself after you've figured out the fix for this issue since I know how to do it quickly and it'll be a three minute job for me.

hvasbath commented 1 year ago

Indeed that was how I found it. I setup an identical test in Matlab comparing numbers that matlab produces with cutde. To my understanding matlab returns the correct number. That code is extensively tested and used since 8 years. The next week I won't be able to continue here. But for sure will do so after that. Will keep you posted.

hvasbath commented 1 year ago

For now did as you suggested. PR here: https://github.com/tbenthompson/cutde/pull/26

I found something else where I got suspicious: https://github.com/tbenthompson/cutde/blob/master/cutde/common.cu#L752 In my understanding I would think this should be set ${disp_fs("image_tri", "true")} instead of ${disp_fs("image_tri")}

Can you comment?

tbenthompson commented 1 year ago

I found something else where I got suspicious: https://github.com/tbenthompson/cutde/blob/master/cutde/common.cu#L752 In my understanding I would think this should be set ${disp_fs("image_tri", "true")} instead of ${disp_fs("image_tri")}

Can you comment?

I think you are correct. That does seem wrong. If I follow through the code, the result is that some strike vector negation here is not happening. If you make the ${disp_fs("image_tri", "true")} fix, does that affect the tests at all? Should it? I'd want to make sure that we aren't compensating for this bug elsewhere in the code if you fix it.

Thanks for finding these issues! This is wonderful!

tbenthompson commented 1 year ago

BTW, I haven't made a PyPI release yet after the PR you wrote so if you install with pip install cutde or the same with conda, you'll still get the old version. Let's wait to do a release until this other bug gets resolved.

hvasbath commented 1 year ago

Ok thats only important for an exact horizontal dislocation. I modified the test and it is still working although it shouldnt. I dont understand why, but although "is_halfspace" is false it always passes this if statement here: https://github.com/tbenthompson/cutde/blob/master/cutde/common.cu#L692 Can you give an explanation and how to fix this? The web says it should work like you implemented it ... but it always passes the if....

Such that it still works, but just because the point Z value is > 0. I guess this should become a problem for fullspace solutions then when Z is above 0. Will setup such a test as well...

hvasbath commented 1 year ago

Ok nevermind the previous. New PR here: https://github.com/tbenthompson/cutde/pull/27 Now I think it is ok. At least I havent found anything else ;) . Can you please explain the % if .... % endif construction? I couldnt find anything about what the % before those does.

tbenthompson commented 1 year ago

I think the problem is with the line here: https://github.com/tbenthompson/cutde/blob/bf7aa3eecd673224c3bd73688d3428fcd5f9a84f/cutde/common.cu#L602

if is_halfspace: evaluates to True because is_halfspace is the string 'false' rather than the boolean False.

So, I think that line should be changed to:

<%def name="disp_fs(tri_prefix, is_halfspace=False)"> 
tbenthompson commented 1 year ago

Oh, I just saw your new comment right after I hit send on mine. =)

tbenthompson commented 1 year ago

The % if and % endif and such are from the Mako templating. We're using Mako here to essentially do a bunch of code generation: https://github.com/tbenthompson/cutde/blob/bf7aa3eecd673224c3bd73688d3428fcd5f9a84f/cutde/gpu_backend.py#L57-L65

Here's the docs for Mako: https://www.makotemplates.org