modularml / mojo

The Mojo Programming Language
https://docs.modular.com/mojo/manual/
Other
23.01k stars 2.59k forks source link

[BUG] `PythonObject` doesn't handle large `UInt64` correctly #3510

Open soraros opened 2 weeks ago

soraros commented 2 weeks ago

Bug description

As title.

A Mojo UInt64 value went through the following to reach Python runtime:

An unwanted bit cast happened on CPython's API boundary.

We should

Steps to reproduce

from python import PythonObject

def main():
  n = UInt64(0xFEDCBA0987654321)
  print(PythonObject(n))  # prints -81986143110479071

A more peculiar repro:

from python import Python

def main():
  py = Python.import_module("builtins")
  a = UInt64(0x1234567890ABCDEF)
  b = UInt64(0xFEDCBA0987654321)
  # bb = py.int(str(b))      # doesn't crash
  a, bb = a, py.int(str(b))  # only crashes when tuple unpacking is used
  print(a + bb)

System information

Mojo 2024.9.905 (ae516e17) on Docker, Intel Mac
dhruvmalik007 commented 1 week ago

hi @JoeLoser , here is the code that will act patch for this bug : https://github.com/dhruvmalik007/mojo/commit/bf8d434b67f196437adc91e9b1f28e0fdf6f1e01 , let me know if you need more tests / changes required and I will do the needful in order to create PR and merge .

Thanks

soraros commented 1 week ago

Hi @dhruvmalik007, thanks for your interest! Feel free to make a PR and we can iterate there.

dhruvmalik007 commented 1 week ago

Hey i have created the PR , with the merging the nighly branch as described in the mojo.

soraros commented 5 days ago

@dhruvmalik007 We should probably switch to PyLong_FromSsize_t and PyLong_FromSize_t for ABI compatibility.

dhruvmalik007 commented 5 days ago

thanks for first review and sure will do.

just to clarify then what the condition we should define for the branch

    fn __init__[dt: DType](inout self, value: SIMD[dt, 1]):
          ## ......  other implementations
          elif dt == DType.size_t:
                    ##  equating the PyLong_FromSsize_t to the  pyobject. 
          elif dt == DType.ssize_t:
                    ## implementation of the PyLong_FromSize_t to the pyobject.

but the issue is that the definition of the Dtypes depends on the MLIR attr as defined here , and these had to be defined for the unsigned int and int of arbitrary lengths size_t and ssize_t .

@soraros maybe you can check my recent commit and guide how to implement it. thanks