NVIDIA-Merlin / core

Core Utilities for NVIDIA Merlin
Apache License 2.0

add npy export capability to dataset #337

Closed jperez999 closed 1 year ago

jperez999 commented 1 year ago

This PR adds support for exporting a dataset to an npy file. Certain caveats and checks must be abided by for this to work. There are two ways to write: as a single large dataframe dump, or with append, which lets you add larger-than-memory data to the file. List columns must be exploded along the column axis before they can be written to an npy file; a helper method has been added to facilitate this.
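As a rough illustration of the two ideas described above (exploding list columns along the column axis, then writing chunk by chunk), here is a minimal sketch using only pandas and NumPy. It is not the code from this PR: `explode_list_columns` and `write_npy_in_chunks` are hypothetical names, the sketch assumes every list column has a fixed length, and it preallocates the file with `np.lib.format.open_memmap` (so the total row count must be known up front), which may differ from how the PR implements append.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def explode_list_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: spread each fixed-length list column over scalar columns."""
    parts = []
    for name in df.columns:
        col = df[name]
        is_list = len(col) > 0 and col.map(
            lambda v: isinstance(v, (list, tuple, np.ndarray))
        ).all()
        if is_list:
            # An npy array is rectangular, so ragged list rows cannot be stored.
            if col.map(len).nunique() != 1:
                raise ValueError(f"list column {name!r} has rows of different lengths")
            wide = pd.DataFrame(col.tolist(), index=df.index)
            wide.columns = [f"{name}_{i}" for i in range(wide.shape[1])]
            parts.append(wide)
        else:
            parts.append(col.to_frame())
    return pd.concat(parts, axis=1)


def write_npy_in_chunks(chunks, path, total_rows, dtype=np.float32):
    """Hypothetical writer: fill one preallocated npy file a chunk at a time."""
    out = None
    row = 0
    for chunk in chunks:
        values = explode_list_columns(chunk).to_numpy(dtype=dtype)
        if out is None:
            # Assumption: total_rows is known, so the file can be sized once
            # and only one chunk has to be held in memory at any time.
            out = np.lib.format.open_memmap(
                path, mode="w+", dtype=dtype, shape=(total_rows, values.shape[1])
            )
        out[row : row + len(values)] = values
        row += len(values)
    if out is not None:
        out.flush()
    return row


df = pd.DataFrame(
    {
        "user_id": [1, 2, 3, 4],
        "embedding": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
    }
)
path = os.path.join(tempfile.mkdtemp(), "dataset.npy")
chunks = [df.iloc[:2], df.iloc[2:]]  # stand-in for partitions of a larger dataset
rows_written = write_npy_in_chunks(chunks, path, total_rows=len(df))
loaded = np.load(path)
```

Because an npy file holds a single homogeneous array, every column is cast to one dtype here; mixed-type datasets would need a different layout or a common dtype chosen by the caller.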

github-actions[bot] commented 1 year ago

Documentation preview

https://nvidia-merlin.github.io/core/review/pr-337