NVIDIA-Merlin / core

Core Utilities for NVIDIA Merlin
Apache License 2.0

Fix udf op #342

Closed. jperez999 closed this 1 year ago

jperez999 commented 1 year ago

This PR fixes the UDF op issues outlined in https://github.com/NVIDIA-Merlin/systems/issues/360. The issue occurs when the user-defined function is written for a specific framework: the output ends up in that framework's type, which breaks the expectation that the output type matches the input type. The fix collects the outputs in a dictionary instead of a collection of the input type. Once collection is complete, we create a dataframe to ensure continued support for downstream operators. This is necessary to handle all the possible input types (i.e. TensorTable, cuDF DataFrame, pandas DataFrame). If a TensorTable is used, it will be converted before the next operator in the executor.
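
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `HostFrame`, `DeviceFrame`, `DeviceSeries`, `to_list`, and `apply_udf` are hypothetical stand-ins (for a pandas DataFrame, a cuDF DataFrame, a cuDF Series, and the op's transform step), and rebuilding the output with the input's type is an assumption based on the statement that the output should match the input.

```python
class _Frame:
    """Minimal column container: maps column name -> list of values."""

    def __init__(self, columns):
        self.columns = {name: list(values) for name, values in columns.items()}


class HostFrame(_Frame):
    """Hypothetical stand-in for a pandas DataFrame."""


class DeviceFrame(_Frame):
    """Hypothetical stand-in for a cuDF DataFrame."""


class DeviceSeries:
    """Hypothetical stand-in for a cuDF Series returned by a UDF."""

    def __init__(self, values):
        self.values = list(values)


def to_list(column):
    # Normalize whatever the UDF returned into a framework-neutral list.
    if isinstance(column, DeviceSeries):
        return list(column.values)
    return list(column)


def apply_udf(frame, udf, column_names):
    # Collect results in a plain dict, independent of the UDF's framework.
    collected = {}
    for name in column_names:
        collected[name] = to_list(udf(frame.columns[name]))
    # Only after collection is complete, build a frame of the input's type.
    return type(frame)(collected)


def double_on_device(values):
    # A UDF written for one specific framework: it always returns DeviceSeries.
    return DeviceSeries(v * 2 for v in values)


result = apply_udf(HostFrame({"a": [1, 2, 3]}), double_on_device, ["a"])
```

Here the input is a `HostFrame` and the UDF returns a `DeviceSeries`, yet `result` is still a `HostFrame`, because the intermediate dictionary decouples the output type from whatever the UDF produced.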

github-actions[bot] commented 1 year ago

Documentation preview

https://nvidia-merlin.github.io/core/review/pr-342

oliverholworthy commented 1 year ago

It's probably worth adding a test for the case where we pass a pandas DataFrame but the user-defined function returns a cuDF Series, and vice versa.
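
A test along these lines might cover every input/UDF pairing. The sketch below is only illustrative: the built-in `list` and `tuple` types stand in for pandas and cuDF so that it runs without either library, and `apply_udf` and `make_udf` are hypothetical helpers, not functions from Merlin.

```python
import itertools


def apply_udf(column, udf):
    # Hypothetical helper: gather the UDF result into a neutral list,
    # then rebuild it as the same type as the input column.
    collected = list(udf(column))
    return type(column)(collected)


def make_udf(return_type):
    # Build a UDF that always returns `return_type`, whatever it receives.
    def udf(column):
        return return_type(v + 1 for v in column)

    return udf


# list and tuple stand in for the two frameworks (e.g. pandas and cuDF).
CONTAINERS = (list, tuple)

checked = []
for input_type, udf_type in itertools.product(CONTAINERS, repeat=2):
    column = input_type([1, 2, 3])
    result = apply_udf(column, make_udf(udf_type))
    # The output type must follow the input, not the UDF's return type.
    assert type(result) is input_type
    assert list(result) == [2, 3, 4]
    checked.append((input_type.__name__, udf_type.__name__))
```

Iterating over the product of the two types exercises both mismatched directions as well as the two matching cases.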

jperez999 commented 1 year ago

Test added.