apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[Python] Jobs fail if Pyarrow version is not correctly generated due to missing remote dev tags #44803

Open raulcd opened 19 hours ago

raulcd commented 19 hours ago

Describe the bug, including details regarding any error messages, version, and platform.

Jobs sometimes fail when a contributor opens a PR and their remote git repository does not contain the development tags. In those cases the generated pyarrow version looks like pyarrow-0.1.dev16896+ge3b9892. An example of this issue can be seen in this comment: https://github.com/apache/arrow/pull/44720#issuecomment-2490314967, where the build fails with:

 self = <[AttributeError("'ArrowDtype' object has no attribute 'pyarrow_dtype'") raised in repr()] ArrowDtype object at 0x7f5b9b45f610>
pyarrow_dtype = DataType(int64)

    def __init__(self, pyarrow_dtype: pa.DataType) -> None:
        super().__init__("pyarrow")
        if pa_version_under10p1:
>           raise ImportError("pyarrow>=10.0.1 is required for ArrowDtype")
E           ImportError: pyarrow>=10.0.1 is required for ArrowDtype

on the:

We should find a workaround for those cases instead of having to tell users to push tags to their remote.
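
To illustrate why the fallback version trips the check in the traceback above, here is a minimal sketch of a minimum-version comparison. The helper names `release_tuple` and `is_under` are hypothetical and this is not the actual implementation behind `pa_version_under10p1`. It only compares the leading numeric release segment and ignores pre-release and dev ordering, and the `19.0.0.dev123+gabcdef0` string is a made-up example of a version generated with tags present.

```python
import re


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '0.1.dev5+gabc' -> (0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_under(version: str, minimum: str) -> bool:
    """Hypothetical helper: True if `version` has a lower release segment than `minimum`.

    Simplification: dev/pre-release suffixes are ignored, so '10.0.1.dev5'
    is treated the same as '10.0.1'.
    """
    have, need = release_tuple(version), release_tuple(minimum)
    # Pad the shorter tuple with zeros so (10,) and (10, 0, 0) compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have < need


# Version generated when the development tags are missing (taken from the report above).
print(is_under("0.1.dev16896+ge3b9892", "10.0.1"))
# Made-up example of a version generated when the tags are present.
print(is_under("19.0.0.dev123+gabcdef0", "10.0.1"))
```

A CI job could, under the same assumptions, run a check like this right after the build and fail early with a clear message about missing tags, rather than surfacing as an unrelated ImportError in the test suite.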

Component(s)

Python