dask / dask

Parallel computing with task scheduling
https://dask.org
BSD 3-Clause "New" or "Revised" License
12.59k stars 1.71k forks

Inconsistency in ddf.astype(Arrow Dict) #11078

Closed mscanlon-exos closed 6 months ago

mscanlon-exos commented 6 months ago

Describe the issue: When using an Arrow dictionary type of string indexed by int32, calling astype directly on the dask dataframe coerces the values to pa.large_string, but a dask dataframe created from a pandas dataframe that was already cast correctly keeps the string: int32 type.

Minimal Complete Verifiable Example:

import pandas as pd
import pyarrow as pa
import dask.dataframe as dd

df = pd.DataFrame(data={'cat_col': ['FOO', 'BAR']})
ddf = dd.from_pandas(df)

data_type = pd.ArrowDtype(pa.dictionary(pa.int32(), pa.string()))

# Casting directly on the dask dataframe coerces the values to large_string
new_ddf = ddf.astype(dtypes={'cat_col': data_type})
assert new_ddf.dtypes.cat_col != data_type
assert new_ddf.dtypes.cat_col == pd.ArrowDtype(pa.dictionary(pa.int32(), pa.large_string()))

# Casting on the pandas dataframe keeps pa.string as requested
new_df = df.astype(dtype={'cat_col': data_type})
assert new_df.dtypes.cat_col == data_type

# ...and the dtype survives the conversion to a dask dataframe
new_ddf_from_pandas_coerced = dd.from_pandas(new_df)
assert new_ddf_from_pandas_coerced.dtypes.cat_col == data_type

Anything else we need to know?: I would expect the behavior to be consistent either way (string or large_string). My recommendation would be to keep it as pa.string, since that is the pandas behavior, but maybe there is a reason this was changed?

Environment:

phofl commented 6 months ago

Thanks for your report. This is actually a pandas bug; see https://github.com/pandas-dev/pandas/pull/58479 for a fix.

Closing here since we're waiting for the fix upstream.