NVIDIA-Merlin / HugeCTR

HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training
Apache License 2.0

[BUG] DIN sample refers to old version of NVTabular and produces error when running w/ 22.09 container #364

Status: Closed (closed by jsohn-nvidia 2 years ago)

jsohn-nvidia commented 2 years ago

Describe the bug: The DIN sample in the repo (https://github.com/NVIDIA-Merlin/HugeCTR/tree/main/samples/din) seems to use an outdated NVTabular API and produces an error when run in the 22.09 container.

To Reproduce: Follow the instructions shown on the sample page.

Expected behavior: The sample runs without an error.

Additional context: I have not tested with 22.10, which was released recently, but I expect it not to work either, as the sample hasn't been updated for 4 months.

haochuan-li commented 2 years ago

22.10 has the same issue. Running 4_nvt_process.py fails with the error "No module named column_group in nvtabular"; column_group is an old API from v0.6.1. After I changed this line, I got a different error.

[image: screenshot of the second error]

Can anyone help with this?

Thanks!
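Because the failure depends on which NVTabular release the container ships, it can help to confirm the installed version before editing the script. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical, and the distribution name "nvtabular" is assumed to match what the container installs.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the distribution is registered under the name "nvtabular".
print("nvtabular:", installed_version("nvtabular"))
```

If this prints None, the package is not visible to the interpreter that runs the script, which would be a separate problem from the outdated API.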

jershi425 commented 2 years ago

@Spidey0918 The fixed script will be available in the next release. For a quick fix, replace the line features = LABEL + ColumnGroup(CAT_COLUMNS) with features = LABEL + CAT_COLUMNS >> FillMissing() and add from nvtabular.ops import FillMissing at the top of the script. This fix is not sufficient to remove all the warnings, but it should be enough to get the script working. Thanks, and let me know if you have any other questions.
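The replacement line relies on Python operator precedence: + binds tighter than >>, so the two column lists are concatenated first and the combined list is then piped into the operator. The sketch below illustrates this with a stand-in class rather than NVTabular itself, so it runs without the library. FakeOp and the column names are hypothetical, and the assumption that NVTabular operators accept a plain list on the left of >> through a reflected-shift method is based on the quick fix above rather than verified here.

```python
# Stand-in for an NVTabular operator such as FillMissing; NOT the real class.
class FakeOp:
    def __init__(self, name):
        self.name = name

    def __rrshift__(self, columns):
        # Python calls this for `list >> FakeOp(...)` because list defines no `>>`.
        return {"op": self.name, "columns": list(columns)}


# Hypothetical column names; the sample script defines its own lists.
LABEL = ["label"]
CAT_COLUMNS = ["user_id", "item_id", "category"]

# `+` binds tighter than `>>`, so this parses as (LABEL + CAT_COLUMNS) >> op.
features = LABEL + CAT_COLUMNS >> FakeOp("FillMissing")
explicit = (LABEL + CAT_COLUMNS) >> FakeOp("FillMissing")

print(features)
print(features == explicit)
```

In the real script, the label column is therefore also passed through FillMissing, since it is part of the concatenated list.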

haochuan-li commented 2 years ago

@jershi425 Thanks! That helps a lot. I can get the DIN model working now.