NVIDIA-Merlin / models

Merlin Models is a collection of deep learning recommender system model reference implementations
https://nvidia-merlin.github.io/models/main/index.html
Apache License 2.0

[QST] #1211

Closed fahadullahshah261 closed 9 months ago

fahadullahshah261 commented 10 months ago

❓ Questions & Help

I had an issue while importing merlin.models.tf.

Details

I ran the following:

pip install merlin-models

But when I then ran:

import merlin.models.tf as mm

I got the following error message:

     19 import numpy as np
     20 import tensorflow as tf
---> 21 from keras.utils.tf_inspect import getfullargspec
     22 from packaging import version
     23 from tensorflow.python import to_dlpack

ModuleNotFoundError: No module named 'keras.utils.tf_inspect'
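
A quick way to confirm what the traceback says is to check whether the installed Keras still ships the module being imported. The snippet below is only a minimal sketch and not part of Merlin; the helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the dotted module `name` can be located.

    find_spec imports the parent packages (e.g. keras and keras.utils)
    but does not execute the leaf module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing or non-package parent raises ModuleNotFoundError
        # (a subclass of ImportError); treat that as "not available".
        return False


if __name__ == "__main__":
    target = "keras.utils.tf_inspect"
    print(target, "available:", module_available(target))
```

If this reports that the module is not available, the installed Keras/TensorFlow does not provide the import that merlin-models expects, which points to a version mismatch rather than a broken merlin-models install.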

rnyak commented 10 months ago

@fahadullahshah261 how did you install merlin-models? We recommend using the merlin-tensorflow:23.06 Docker image.

fahadullahshah261 commented 10 months ago

@rnyak, I was using Merlin to train my recommender model, and it worked initially when I installed and imported it as follows:

pip install merlin-models
import merlin.models.tf

Recently, however, the import started failing as described above. Also, Docker is cumbersome to use in Google Colab. What do I need to do to use Merlin in Colab as I did before? Thanks.

rnyak commented 10 months ago

@fahadullahshah261 you can follow the steps here to run it on Colab:

https://medium.com/nvidia-merlin/how-to-run-merlin-on-google-colab-83b5805c63e0

Note that the other Merlin libraries also need to be installed for merlin-models to work. You also need to install tensorflow-gpu; you can use TF 2.9 or TF 2.10.
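
To see which versions actually ended up installed in a Colab runtime, something like the sketch below may help. It assumes the PyPI distribution names `tensorflow`, `tensorflow-gpu` and `merlin-models`; the helpers `installed_version` and `minor_version_in` are hypothetical, and the set of allowed TensorFlow minor versions is whatever you choose to pass in.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def minor_version_in(version, allowed_minors):
    """True if a version like '2.10.1' has its major.minor in allowed_minors."""
    parts = version.split(".")
    if len(parts) < 2:
        return False
    return ".".join(parts[:2]) in allowed_minors


if __name__ == "__main__":
    # Distribution names are assumptions; adjust to what you installed.
    for dist in ("tensorflow", "tensorflow-gpu", "merlin-models"):
        print(dist, "->", installed_version(dist) or "not installed")
```

For example, `minor_version_in("2.10.1", {"2.10"})` is true, while `minor_version_in("2.1.0", {"2.10"})` is false, because only the exact major.minor pair is compared.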

fahadullahshah261 commented 9 months ago

Thanks, @rnyak! I tried it and it worked.

dking21st commented 6 months ago

@fahadullahshah261 how did you install merlin-models? We recommend using the merlin-tensorflow:23.06 Docker image.

@rnyak, can you help me figure out what I am doing wrong with my Docker image? I'm new to Airflow and Docker, but I'm trying to use them to deploy my Merlin model training and uploading job. I built the Docker image with the following Dockerfile:

FROM --platform=linux/amd64 nvcr.io/nvidia/merlin/merlin-tensorflow:23.06 as prod

WORKDIR /ads_content

COPY ./data-airflow .
COPY ./ads/images/requirements.txt .

WORKDIR /root

#RAPIDs
RUN pip install tf2onnx==1.15.1
RUN pip install -r /ads_content/requirements.txt
RUN pip install requests "urllib3<2"

WORKDIR /ads_content

ENTRYPOINT ["python3"]

But when I try to run my Python file with this image, I always see this error message:

tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

I searched on Google, and this error apparently occurs when the graphics driver is outdated. Should I install the driver separately, or add steps for it to the Dockerfile?
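
For context, this error generally means that the NVIDIA driver visible to the container supports an older CUDA version than the CUDA runtime bundled in the image. The driver belongs to the host and is exposed to the container at run time (for example through the NVIDIA Container Toolkit), so it is usually not something a Dockerfile step can install. Below is a minimal sketch for checking what the driver reports, assuming `nvidia-smi` is on the PATH and prints its usual `CUDA Version: X.Y` header. The helper names are hypothetical, and the comparison is a simplification that ignores CUDA minor-version and forward-compatibility packages.

```python
import re
import shutil
import subprocess


def parse_driver_cuda_version(smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' header, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_supports(driver_cuda, runtime_cuda):
    """Simplified check: the driver's CUDA version must be >= the runtime's."""
    return driver_cuda >= runtime_cuda


def query_driver_cuda_version():
    """Run nvidia-smi if present and parse its header; None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_driver_cuda_version(result.stdout)


if __name__ == "__main__":
    driver = query_driver_cuda_version()
    if driver is None:
        print("No usable NVIDIA driver is visible from this environment.")
    else:
        print("Driver supports CUDA up to %d.%d" % driver)
```

Running this both on the host and inside the container can show whether a GPU driver is visible at all, and whether the version it reports is older than the CUDA runtime the image was built against.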