GenericMappingTools / pygmt

A Python interface for the Generic Mapping Tools.
https://www.pygmt.org
BSD 3-Clause "New" or "Revised" License

clib.conversion._to_numpy: Add tests for pandas.Series with pandas string dtype #3607

Closed seisman closed 1 week ago

seisman commented 2 weeks ago

Description of proposed changes

Add tests for pandas.Series with string dtype. Six cases are tested:

  1. dtype=None
  2. dtype=np.str_
  3. dtype="U10"
  4. dtype="string[python]"
  5. dtype="string[pyarrow]"
  6. dtype="string[pyarrow_numpy]"

None of these cases can be converted to np.str_ directly. Cases 4-6 can be fixed by commit 01ba31786ddf424027b3c3472339cd43d6fb49d5, and cases 1-3 by commit dac7e8eda7632464f556eaae5f1ac8ac34c9b6bd.

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: x = pd.Series(["abc", "defg", "12345"], dtype=None)

In [4]: x.dtype
Out[4]: dtype('O')

In [5]: np.ascontiguousarray(x)
Out[5]: array(['abc', 'defg', '12345'], dtype=object)

In [6]: x = pd.Series(["abc", "defg", "12345"], dtype=np.str_)

In [7]: x.dtype
Out[7]: dtype('O')

In [8]: np.ascontiguousarray(x)
Out[8]: array(['abc', 'defg', '12345'], dtype=object)

In [9]: x = pd.Series(["abc", "defg", "12345"], dtype="U10")

In [10]: x.dtype
Out[10]: dtype('O')

In [11]: x = pd.Series(["abc", "defg", "12345"], dtype="string[python]")

In [12]: x.dtype
Out[12]: string[python]

In [13]: str(x.dtype)
Out[13]: 'string'

In [14]: np.ascontiguousarray(x)
Out[14]: array(['abc', 'defg', '12345'], dtype=object)

In [15]: x = pd.Series(["abc", "defg", "12345"], dtype="string[pyarrow]")

In [16]: x.dtype
Out[16]: string[pyarrow]

In [17]: str(x.dtype)
Out[17]: 'string'

In [18]: np.ascontiguousarray(x)
Out[18]: array(['abc', 'defg', '12345'], dtype=object)

In [19]: x = pd.Series(["abc", "defg", "12345"], dtype="string[pyarrow_numpy]")

In [20]: x.dtype
Out[20]: string[pyarrow_numpy]

In [21]: str(x.dtype)
Out[21]: 'string'

In [22]: np.ascontiguousarray(x)
Out[22]: array(['abc', 'defg', '12345'], dtype=object)
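
In each case where np.ascontiguousarray is called above, it returns an object-dtype array rather than a NumPy string array. One way to get a np.str_ array is an explicit cast after the conversion. The following is only a minimal sketch of that idea: strings_to_numpy is a hypothetical helper written for illustration, not PyGMT's _to_numpy and not necessarily what the referenced commits do. It assumes the series holds no missing values and skips the pyarrow-backed dtypes so that it runs without pyarrow installed.

```python
import numpy as np
import pandas as pd


def strings_to_numpy(series: pd.Series) -> np.ndarray:
    """Hypothetical helper for illustration only (not PyGMT's _to_numpy).

    Convert a Series to a C-contiguous array and, if the result is an
    object array holding only Python str values, cast it to np.str_.
    """
    array = np.ascontiguousarray(series)
    # Cast only when every element is a str, so mixed-type object arrays
    # are returned unchanged instead of being silently stringified.
    if array.dtype.kind == "O" and all(isinstance(v, str) for v in array):
        array = array.astype(np.str_)
    return array


# Cases 1-4 from the list above (the pyarrow-backed cases are omitted here).
for dtype in (None, np.str_, "U10", "string[python]"):
    series = pd.Series(["abc", "defg", "12345"], dtype=dtype)
    result = strings_to_numpy(series)
    print(dtype, "->", result.dtype)
```

The element check is a deliberate design choice in this sketch: casting an arbitrary object array with astype(np.str_) would also turn numbers or other objects into their string representations, which is probably not what a caller expects.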