open2c / bioframe

Genomic interval operations on Pandas DataFrames
MIT License

Python 3.13: test_assembly_info fails with AttributeError #209

Open penguinpee opened 5 months ago

penguinpee commented 5 months ago

Fedora is preparing for the next Python release and currently testing packages with Python 3.13.0b1. It turns out one of the tests is failing with Python 3.13. I'm not entirely sure if this is an issue with bioframe or with pandas. I'm reporting it here as a starting point and for awareness. We are using version 2.2.1 of pandas.

______________________________ test_assembly_info ______________________________
    def test_assembly_info():
>       hg38 = assembly_info("hg38")
tests/test_assembly_info.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../BUILDROOT/python-bioframe-0.6.4-1.fc41.x86_64/usr/lib/python3.13/site-packages/bioframe/io/assembly.py:144: in assembly_info
    result = assemblies.query(q)
/usr/lib64/python3.13/site-packages/pandas/core/frame.py:4811: in query
    res = self.eval(expr, **kwargs)
/usr/lib64/python3.13/site-packages/pandas/core/frame.py:4937: in eval
    return _eval(expr, inplace=inplace, **kwargs)
/usr/lib64/python3.13/site-packages/pandas/core/computation/eval.py:328: in eval
    env = ensure_scope(
/usr/lib64/python3.13/site-packages/pandas/core/computation/scope.py:58: in ensure_scope
    return Scope(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <[AttributeError("'Scope' object has no attribute 'resolvers'") raised in repr()] Scope object at 0x7f3716c1aa70>
level = 4, global_dict = None, local_dict = None
resolvers = ({'cytobands': 0     hg19.cytoband.tsv
1     hg19.cytoband.tsv
2     hg38.cytoband.tsv
3     hg38.cytoband.tsv
4      ...    7
8      8
9      9
10    10
11    11
12    12
13    13
14    14
15    15
16    16
17    17
18    18
dtype: int64})
target = None
    def __init__(
        self, level: int, global_dict=None, local_dict=None, resolvers=(), target=None
    ) -> None:
        self.level = level + 1

        # shallow copy because we don't want to keep filling this up with what
        # was there before if there are multiple calls to Scope/_ensure_scope
        self.scope = DeepChainMap(DEFAULT_GLOBALS.copy())
        self.target = target

        if isinstance(local_dict, Scope):
            self.scope.update(local_dict.scope)
            if local_dict.target is not None:
                self.target = local_dict.target
            self._update(local_dict.level)

        frame = sys._getframe(self.level)

        try:
            # shallow copy here because we don't want to replace what's in
            # scope when we align terms (alignment accesses the underlying
            # numpy array of pandas objects)
            scope_global = self.scope.new_child(
                (global_dict if global_dict is not None else frame.f_globals).copy()
            )
            self.scope = DeepChainMap(scope_global)
            if not isinstance(local_dict, Scope):
                scope_local = self.scope.new_child(
>                   (local_dict if local_dict is not None else frame.f_locals).copy()
                )
E               AttributeError: 'FrameLocalsProxy' object has no attribute 'copy'
/usr/lib64/python3.13/site-packages/pandas/core/computation/scope.py:176: AttributeError

musicinmybrain commented 5 months ago

The attempt to call copy() on frame.f_locals (which, in Python 3.13, is a FrameLocalsProxy) is in Pandas code, so this is in some sense a Pandas bug.

However, based on https://github.com/python/cpython/issues/118921 and the associated (merged) PR https://github.com/python/cpython/pull/118933, it looks like the fix will be in CPython itself.
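
To illustrate the failure mode outside of pandas, here is a minimal standard-library sketch. It is not pandas' actual code, and `snapshot_locals` is a hypothetical helper name introduced only for this example. It takes a snapshot of a caller's local variables with `dict(frame.f_locals)` instead of `frame.f_locals.copy()`. That should work whether `f_locals` is a plain dict (Python 3.12 and earlier) or a FrameLocalsProxy (Python 3.13), since it only relies on the mapping protocol.

```python
import sys


def snapshot_locals(depth=0):
    """Return a plain-dict snapshot of a caller's local variables.

    Hypothetical helper for illustration only, not part of pandas or bioframe.
    depth=0 means the function that called snapshot_locals().
    """
    # +1 skips this helper's own frame.
    frame = sys._getframe(depth + 1)
    # dict(...) copies the mapping without requiring a .copy() method,
    # which the traceback above shows is missing on FrameLocalsProxy
    # in Python 3.13.0b1.
    return dict(frame.f_locals)


def caller():
    x = 1
    y = "a"
    return snapshot_locals()


if __name__ == "__main__":
    print(snapshot_locals.__name__, caller())
```

Whether pandas should adopt something like this, or simply rely on the CPython-side fix mentioned above, is a separate question; the sketch only shows that the snapshot does not need `.copy()`.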

nvictus commented 5 months ago

Interesting. This comes from the use of pandas.eval (via DataFrame.query()), which would presumably affect a lot of pandas code in the wild.