NVIDIA-Merlin / Merlin

NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
Apache License 2.0

Hadoop #97

Open albert17 opened 2 years ago

albert17 commented 2 years ago

Can we get Hadoop downloaded, uncompressed, and cleaned up only once?

It happens twice:

  1. https://github.com/NVIDIA-Merlin/Merlin/blob/main/docker/dockerfile.tri#L49

  2. https://github.com/NVIDIA-Merlin/Merlin/blob/main/docker/dockerfile.tri#L225

Aha! Link: https://nvaiinfa.aha.io/features/MERLIN-833

jershi425 commented 2 years ago

Hi @albert17, I don't see how Hadoop is downloaded, uncompressed, and cleaned up more than once. Can you explain how L49 is related to Hadoop?
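
One way to settle the question is to list every line of the Dockerfile that mentions Hadoop and compare the line numbers against the two links above. The following is a minimal sketch of that check. The function name `find_hadoop_lines` and the sample Dockerfile text are hypothetical and do not reflect the actual contents of `docker/dockerfile.tri`.

```python
import re


def find_hadoop_lines(dockerfile_text, pattern=r"hadoop"):
    """Return (line_number, stripped_line) pairs for lines matching `pattern`.

    Matching is case-insensitive and line numbers are 1-based, so they can be
    compared directly with GitHub `#L<n>` anchors.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(dockerfile_text.splitlines(), start=1)
        if regex.search(line)
    ]


# Hypothetical sample; NOT the real dockerfile.tri contents.
sample = """FROM base-image
RUN wget https://example.invalid/hadoop.tar.gz
RUN echo "unrelated step"
RUN rm -rf /opt/hadoop/share/doc
"""

for number, line in find_hadoop_lines(sample):
    print(f"L{number}: {line}")
```

To run it against a local checkout, read the file with `pathlib.Path("docker/dockerfile.tri").read_text()` and pass the result to the function. If Hadoop-related lines show up in more than one build stage, that would support the original report; if they appear only once, the L49 link may point at something else.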