NVIDIA-Merlin / NVTabular

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.
Apache License 2.0

[FEA] Test HugeCTR notebooks as part of the CI process #1131

Open benfred opened 2 years ago

benfred commented 2 years ago

We need to test the HugeCTR notebooks as part of the CI process, covering both the Criteo and MovieLens example notebooks.

For instance, with MovieLens we test the PT/TF notebooks in this function: https://github.com/NVIDIA/NVTabular/blob/a0b14b5a2cbc7989ad00adcd357817367a18c00d/tests/unit/test_notebooks.py#L156

We have similar tests for Criteo here: https://github.com/NVIDIA/NVTabular/blob/a0b14b5a2cbc7989ad00adcd357817367a18c00d/tests/unit/test_notebooks.py#L36 https://github.com/NVIDIA/NVTabular/blob/a0b14b5a2cbc7989ad00adcd357817367a18c00d/tests/unit/test_notebooks.py#L89
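
As a rough sketch of what a HugeCTR notebook test could build on (this is not the code in the linked test_notebooks.py), the snippet below pulls the code cells out of an .ipynb file into a plain script that a test could then run in a subprocess. The helper name `notebook_to_script` and the rule of dropping lines that start with `%` or `!` are assumptions made for illustration.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Concatenate the code cells of a notebook (given as JSON text) into one script.

    Hypothetical helper: lines starting with % or ! (IPython magics and shell
    escapes) are dropped because they are not valid plain Python.
    """
    notebook = json.loads(notebook_json)
    lines = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # The notebook format allows "source" to be a string or a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        for line in text.splitlines():
            if line.lstrip().startswith(("%", "!")):
                continue
            lines.append(line)
    return "\n".join(lines) + "\n"


# Tiny in-memory notebook standing in for a real example notebook.
example = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["!pip list\n", "x = 1 + 1\n"]},
        {"cell_type": "code", "source": "print(x)"},
    ]
})
script = notebook_to_script(example)
```

Dropping magic lines this way is crude (it can leave an indented block empty), so a real test would likely need notebook-specific handling such as overriding data paths or batch sizes.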

EvenOldridge commented 2 years ago

@zehuanw per our earlier discussion we'll be prioritizing this. We should figure out how to integrate our CI process so it can support HugeCTR.

albert17 commented 2 years ago

@benfred @jperez999 Our CI container does not have HugeCTR installed, so we need to solve that.
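
Until the container ships HugeCTR, one possible stopgap is to skip the HugeCTR notebook tests when the module cannot be imported, so the rest of the suite keeps running. The sketch below uses only the standard library; the helper `module_available` and the test class are hypothetical names, and the test body is a placeholder rather than a real notebook run.

```python
import importlib.util
import unittest


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


class HugeCTRNotebookTests(unittest.TestCase):
    # Skipped (not failed) on containers where HugeCTR is missing.
    @unittest.skipUnless(module_available("hugectr"), "HugeCTR is not installed")
    def test_movielens_hugectr(self):
        # Placeholder: a real test would execute the notebook here.
        import hugectr  # noqa: F401
```

In a pytest-based suite, `pytest.importorskip("hugectr")` at the top of the test would have the same effect. Either way, skipping only hides the gap: the tests do not actually run in CI until HugeCTR is installed in the container.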