possee-org / genai-numpy

MIT License

Bug: Getting errors after building docs on ma.amax function #109

Closed: otieno-juma closed this issue 2 months ago

otieno-juma commented 3 months ago

Description: Here are the error messages I received after running checks on the built docs:

numpy.ma.amax

File "../../numpy-dev/lib/python3.11/site-packages/numpy/__init__.py", line ?, in amax
Failed example:
    np.ma.amax(m, axis=0)
Expected:
    masked_array(data=[4, 5, 6],
                 mask=[False, False, False],
           fill_value=999999)
Got:
    masked_array(data=[4, 5, 6],
                 mask=False,
           fill_value=999999)

File "../../numpy-dev/lib/python3.11/site-packages/numpy/__init__.py", line ?, in amax
Failed example:
    np.ma.amax(m, axis=1)
Expected:
    masked_array(data=[3, 6],
                 mask=[False, False],
           fill_value=999999)
Got:
    masked_array(data=[3, 6],
                 mask=False,
           fill_value=999999)


bmwoodruff commented 3 months ago

@otieno-juma, this appears to be a bug in NumPy itself, or a problem with our machine on Nebari. Have you tried running the np.ma.amax function on your local machine?

Here is the interactive session on Nebari.

Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'1.26.4'
>>> m = np.ma.array([[1, 2, 3], [4, 5, 6]])
>>> np.ma.amax(m, axis=0)
masked_array(data=[4, 5, 6],
             mask=False,
       fill_value=999999)
>>> 

The problem is the mask entry above; the doctest apparently expects something else:

Expected:
masked_array(data=[4, 5, 6],
             mask=[False, False, False],
       fill_value=999999)

When I run the example in a Jupyter Notebook or in an interactive Python session, I don't get the "expected" mask=[False, False, False]; I always get mask=False. I noticed this while reviewing the output and thought it was odd, so I checked it several ways (though I never ran the doctests).
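One quick check, offered as a rough sketch rather than a diagnosis: the ma module uses the scalar sentinel np.ma.nomask to mean "nothing is masked", and np.ma.getmaskarray always expands a mask to a full boolean array. The snippet below reports which form the input and the result carry. The variable names are only illustrative.

```python
import numpy as np

# Same array as in the session above; no mask argument is given.
m = np.ma.array([[1, 2, 3], [4, 5, 6]])
result = np.ma.amax(m, axis=0)

# np.ma.nomask is the scalar sentinel meaning "no element is masked".
input_is_nomask = np.ma.getmask(m) is np.ma.nomask
result_is_nomask = np.ma.getmask(result) is np.ma.nomask

# getmaskarray always returns a full boolean array with the data's shape,
# whichever form the mask is stored in.
expanded_mask = np.ma.getmaskarray(result)

print("input mask is nomask: ", input_is_nomask)
print("result mask is nomask:", result_is_nomask)
print("expanded result mask: ", expanded_mask)
```

If both flags print True, the single False in the repr is the sentinel being carried through the reduction, which would be worth mentioning in any report.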

This sounds like an issue you could submit on NumPy itself. Look to see if someone has reported it already, or something similar. If not, then open a new issue.

If you want to get ambitious, then feel free to see if AI can solve this issue. Somewhere in the source code, the array has been dropped from the mask and replaced with just a single False. If you can track that down (yourself, or with AI), then you can submit a fix.
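A possible first step in tracking this down, sketched under the assumption that the form of the input mask matters: run the same reduction on an array built without a mask and on one built with an explicit all-False mask, then compare the dimensionality of the masks. The names plain and explicit are made up for this comparison.

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]

# Case 1: no mask argument, as in the failing example.
plain = np.ma.array(data)
# Case 2: an explicit all-False mask with the same shape as the data.
explicit = np.ma.array(data, mask=np.zeros((2, 3), dtype=bool))

plain_max = np.ma.amax(plain, axis=0)
explicit_max = np.ma.amax(explicit, axis=0)

# np.ndim is 0 for a scalar mask and >= 1 for a per-element mask array.
for label, before, after in [
    ("no mask given", plain, plain_max),
    ("explicit mask", explicit, explicit_max),
]:
    print(f"{label}: input mask ndim={np.ndim(before.mask)}, "
          f"result mask ndim={np.ndim(after.mask)}")
```

If the two cases print different result dimensions, the difference comes from how the array was constructed, not from the reduction alone, and that narrows down where to look in the source.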

The fact that the doctester says one thing is expected, but something else is obtained, suggests the issue could be related to something machine-specific, and not NumPy itself. That becomes much trickier to solve.

bmwoodruff commented 3 months ago

This also sounds like a great opportunity to learn how to add a unit test to NumPy. I have no idea why we are getting this odd behavior. Sounds like a fun thing to track down. Unfortunately, very few of the maintainers right now work with the ma module, so there might not be anyone on the core team ready to work on or review this topic. They would love to have someone become an expert on the ma module.
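As a starting point, a test might look roughly like the sketch below. The test names are hypothetical, and the assertions deliberately compare the expanded mask, so they pin down the values without settling which printed form is correct. Where such a test belongs in NumPy's test suite is something to confirm with the maintainers.

```python
import numpy as np
from numpy.testing import assert_array_equal


def test_amax_axis0_without_explicit_mask():
    # Hypothetical test: checks only the values, not the mask's storage form.
    m = np.ma.array([[1, 2, 3], [4, 5, 6]])
    result = np.ma.amax(m, axis=0)

    assert isinstance(result, np.ma.MaskedArray)
    assert_array_equal(result.data, [4, 5, 6])
    # getmaskarray expands a scalar mask, so this passes for either form.
    assert_array_equal(np.ma.getmaskarray(result), [False, False, False])


def test_amax_axis1_without_explicit_mask():
    m = np.ma.array([[1, 2, 3], [4, 5, 6]])
    result = np.ma.amax(m, axis=1)

    assert isinstance(result, np.ma.MaskedArray)
    assert_array_equal(result.data, [3, 6])
    assert_array_equal(np.ma.getmaskarray(result), [False, False])


if __name__ == "__main__":
    test_amax_axis0_without_explicit_mask()
    test_amax_axis1_without_explicit_mask()
    print("all checks passed")
```

A stricter test that asserts the exact form of the mask would only make sense once it is clear which form is intended.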

bmwoodruff commented 3 months ago

I would try running the example that gives the unexpected output on several different machines. I'd try it on https://jupyter.org/try-jupyter/lab/. I'd try it on a Windows machine. I'd try it on various installations of NumPy. If you always get False instead of an array of False values, then something is off. If you notice that the output differs between environments, then include that in your issue report.
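To make those comparisons easy to paste into a report, something like the following sketch could be run in each environment. The collect_report helper is a made-up name, not part of NumPy; it only gathers the platform, the Python and NumPy versions, and the repr of both reductions.

```python
import platform
import sys

import numpy as np


def collect_report():
    """Gather environment details plus the repr of each reduction."""
    m = np.ma.array([[1, 2, 3], [4, 5, 6]])
    return {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
        "numpy": np.__version__,
        "axis0": repr(np.ma.amax(m, axis=0)),
        "axis1": repr(np.ma.amax(m, axis=1)),
    }


report = collect_report()
for key, value in report.items():
    print(f"{key}:\n{value}\n")
```

Running the same script everywhere keeps the comparison honest, since the only things that change are the environment details it prints.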

otieno-juma commented 3 months ago

Thank you, Ben, for the feedback. I haven't tried running these examples on my local machine. Now that you have mentioned it, let me try that and document my findings here.

bmwoodruff commented 3 months ago

Are you planning to open an issue on NumPy related to this topic? Did you already? You can close this issue either way (as not planned, or completed). I think opening an issue on NumPy seems reasonable. I'll let you do that, and then take the lead from there.

otieno-juma commented 2 months ago

I was considering marking it as complete since we are not adding examples to functions that already have them. I feel keeping it as an issue in NumPy won't be as fruitful.

bmwoodruff commented 2 months ago

The issue isn't related to AI-generated code; rather, it's an issue in NumPy itself. The right spot for the issue would be on NumPy. If you want to submit that issue there, feel free. If not, I may submit the issue (though look to see if someone else has submitted something similar already). The array portion of the mask is being dropped and replaced with a single value, which seems like a bug.

otieno-juma commented 2 months ago

Okay, I will check on NumPy to see if someone has already posted a similar issue. If not, I will post it there.