NVIDIA-Merlin / core

Core Utilities for NVIDIA Merlin
Apache License 2.0

Use `"float64"` dtypes in `DaskExecutor.fit_phase` #378

Status: Closed (rjzamora closed 4 months ago)

rjzamora commented 4 months ago

Addresses NVTabular test failures with rapids-24.04 (pandas-2).

This PR does not "fix" the existing dtype-preservation problem:

            # TODO: constructing meta like this loses dtype information on the ddf
            # and sets it all to 'float64'. We should propagate dtype information along
            # with column names in the columngroup graph. This currently only
            # happens during intermediate 'fit' transforms, so as long as statoperators
            # don't require dtype information on the DDF this doesn't matter all that much

However, by using explicit `"float64"` dtypes in `DaskExecutor.fit_phase`, it makes pandas-2 behavior consistent with pandas<2.
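For illustration, here is a minimal sketch of the general idea, not the actual diff in this PR. It builds an empty pandas "meta" frame with an explicit `"float64"` dtype per column, so the result does not depend on how a given pandas version infers the dtype of an empty column. The helper name `make_float64_meta` and the column names are hypothetical.

```python
import pandas as pd


def make_float64_meta(column_names):
    """Build an empty DataFrame whose columns are all float64.

    Hypothetical helper for illustration only; not part of merlin-core.
    """
    # An explicit dtype avoids relying on pandas' inference for empty
    # columns, which can differ between pandas versions.
    return pd.DataFrame(
        {name: pd.Series([], dtype="float64") for name in column_names}
    )


# Example usage with made-up column names.
meta = make_float64_meta(["a", "b"])
print(meta.dtypes)
```

This still discards the real column dtypes, as the TODO above describes. It only pins the placeholder dtype so that it is the same under either pandas version.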

github-actions[bot] commented 4 months ago

Documentation preview

https://nvidia-merlin.github.io/core/review/pr-378