diffpy / diffpy.fourigui

Other
0 stars 7 forks source link

Fix for RuntimeWarning #38

Closed cadenmyers13 closed 1 month ago

cadenmyers13 commented 1 month ago

Added this condition to fix the RuntimeWarning from pytest. The other warnings seem to be a bit more involved. Do we want to address this?

cadenmyers13 commented 1 month ago

I can only give reasonable feedback if you copy-paste the warning/error in the comment, as I don't really want to download and reproduce the error unless absolutely necessary. Could you do that? Then I/Bob/Tieqiong can give feedback

Here are those warning messages:

Screenshot 2024-10-24 at 6 06 30 PM
sbillinge commented 1 month ago

@cadenmyers13 it is probably a good idea to put the toml edits on this PR too, so that tests can pass when they are ready to pass.

cadenmyers13 commented 1 month ago

so the correct way to address this warning is to understand what the test is testing and why the warning is generated. It is not clear to me whether the test should be fixed or the code, but I am guessing it might be the test, not the code.

To my understanding, this warning occurs because some of the test data contains NaN values for an entire slice and the function causing this warning is trying to find max and min of that slice. Since only NaN values are in that slice, we get the warning. I ran fourigui (using the data containing a NaN slice) with and without the edits made on this PR and it seems to operate fine in both cases. I think we can get away without fixing this warning if there's concern about the behavior of the code.

sbillinge commented 1 month ago

so the correct way to address this warning is to understand what the test is testing and why the warning is generated. It is not clear to me whether the test should be fixed or the code, but I am guessing it might be the test, not the code.

To my understanding, this warning occurs because some of the test data contains NaN values for an entire slice and the function causing this warning is trying to find max and min of that slice. Since only NaN values are in that slice, we get the warning. I ran fourigui (using the data containing a NaN slice) with and without the edits made on this PR and it seems to operate fine in both cases. I think we can get away without fixing this warning if there's concern about the behavior of the code.

I am not sure you are quite understanding my point. We want to fix the warning but the question is how to fix it. You fixed it by modifying the code which is appropriate is the test is good. But if the problem is the test then we don't want to fix code to make a bad test pass because it makes the code worse, not better, which is the purpose of the tests. So I am asking you to investigate what is the goal of that test, and does it accomplish it's goal? Then we can decide how to fix it.

Tests encode behavior, so the investigation should be about behavior, so not include sentences like "some test data contains all NaN slices" etc. What behavior is the test testing?

cadenmyers13 commented 1 month ago

so the correct way to address this warning is to understand what the test is testing and why the warning is generated. It is not clear to me whether the test should be fixed or the code, but I am guessing it might be the test, not the code.

To my understanding, this warning occurs because some of the test data contains NaN values for an entire slice and the function causing this warning is trying to find max and min of that slice. Since only NaN values are in that slice, we get the warning. I ran fourigui (using the data containing a NaN slice) with and without the edits made on this PR and it seems to operate fine in both cases. I think we can get away without fixing this warning if there's concern about the behavior of the code.

I am not sure you are quite understanding my point. We want to fix the warning but the question is how to fix it. You fixed it by modifying the code which is appropriate is the test is good. But if the problem is the test then we don't want to fix code to make a bad test pass because it makes the code worse, not better, which is the purpose of the tests. So I am asking you to investigate what is the goal of that test, and does it accomplish it's goal? Then we can decide how to fix it.

Tests encode behavior, so the investigation should be about behavior, so not include sentences like "some test data contains all NaN slices" etc. What behavior is the test testing?

After spending some time I think I'm understanding this better. In the source code, np.nanmax() and np.nanmin() raises the RuntimeWarning and returns NaN when the array passed through is all NaN (see ss of numpy API). So, when the test runs (see ss of test function), RuntimeWarning is returned (np.allclose() is comparing NaN to an array of zeros). I think this could now be handled in two ways:

1) Set NaN values to zero before it is passed through np.nanmax/np.nanmin. This would be changing the source code slightly.

2) Change the test to suppress the RuntimeWarning.

Therefore, I dont think the current edits on this PR would be a good fix. Based off what I've read about RuntimeWarning, it seems that option 2 might be the best. In option 1, we run the risk of slightly changing the functionality of the code. I've made a PR with option 2 edits. If you think that would be best, we can close this PR.

Screenshot 2024-10-27 at 5 39 02 PM Screenshot 2024-10-27 at 5 36 12 PM
bobleesj commented 1 month ago

Just swinging by - I am wondering why the test inputs all contain "NaN values for an entire slice" in the first place? Just not understanding the motivation for this test in the first place

sbillinge commented 1 month ago

Yes, I am with Bob on this. I still don't understand what the test is trying to achieve. this is what we want to get to the bottom of.

sbillinge commented 1 month ago

...to expand on that. Maybe we expect that we will encounter NaNs in some array and we don't want the code to crash when that happens for some reason. We decide to handle it by allowing this situation to proceed but to provide a warning to the user that this happened. One way to do that is to handle the NaNs but trigger a RunTimeWarning. In this case I would write a test to ensure this behavior. I would make an input array with NaN's then I would test that it doesn't crash but I would test that it produces a RunTimeWarning with the desired warning message. If it does, the test would pass and the testing framework would capture the warning so we would see it as a green dot and no warning. This is not happening so we need to fix the test, not change the code, which is working as expected (not crashing and triggering a runtime warning).

Another possibility is that I want the code to crash when it encounters NaN's, in which case I want the test to fail if it doesn't crash when the test gives it NaNs. In this case I need to update the code and the test here (the code is not crashing and the test is not failing when it doesn't crash). And so on....

But I am guessing and presenting different scenarios...what we want to do is have a conversation on here which is more along these lines. What behavior do we want the code to have in different scenarios...then we write tests to test that behavior. @cadenmyers13 I appreciate your work on this, but you are delving into the wrong thing...how the code methods work, not how we want our code to behave. Once we decide what behavior we want, we can look into the best way to implement that, in which case we want to get into different code methods that will deliver our desired behavior. Does that make sense?