NVIDIA-Merlin / models

Merlin Models is a collection of deep learning recommender system model reference implementations
https://nvidia-merlin.github.io/models/main/index.html
Apache License 2.0

Remove use of deprecated start_index from Categorify #1134

Closed oliverholworthy closed 1 year ago

oliverholworthy commented 1 year ago

Goals :soccer:

Fix use of Categorify to support new version of NVTabular

Implementation Details :construction:

github-actions[bot] commented 1 year ago

Documentation preview

https://nvidia-merlin.github.io/models/review/pr-1134

oliverholworthy commented 1 year ago

This is the error printed in gpu-ci. It looks related to the horovod init. I wonder if we should avoid the horovod init side effect (triggered by the merlin.models.tf import).

--------------------------------------------------------------------------
Sorry!  You were supposed to get help about:
    mpi_init:startup:internal-failure
But I couldn't open the help file:
    /build-result/hpcx-v2.13-gcc-inbox-ubuntu20.04-cuda11-gdrcopy2-nccl2.12-x86_64/ompi/share/openmpi/help-mpi-runtime.txt: No such file or directory.  Sorry!
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[6bbe81b6bdba:01522] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!