NVIDIA-Merlin / Merlin

NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
Apache License 2.0

[BUG] FileNotFoundError when applying Categorify after JoinExternal #1078

Open PaulSteffen-betclic opened 8 months ago

PaulSteffen-betclic commented 8 months ago

Bug description

When I apply the Categorify() operator to a column after a JoinExternal(), I get an error like the following: FileNotFoundError: [Errno 2] No such file or directory: '~/data/categories/unique.XXXXX.parquet'
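To narrow down which category file is missing, it may help to list what was actually written to the categories directory. The following is a minimal stdlib-only sketch, not part of NVTabular: the helper names `list_category_files` and `missing_category_files` are hypothetical, and it assumes the `unique.<column>.parquet` naming suggested by the error message above.

```python
from pathlib import Path


def list_category_files(categories_dir):
    """Return the sorted names of unique.*.parquet files in categories_dir.

    Hypothetical helper. expanduser() resolves a leading "~", which
    Python file APIs such as open() do not expand on their own.
    """
    root = Path(categories_dir).expanduser()
    if not root.is_dir():
        # Directory does not exist, so no category files were written there.
        return []
    return sorted(p.name for p in root.glob("unique.*.parquet"))


def missing_category_files(categories_dir, columns):
    """Return the columns that have no unique.<column>.parquet file on disk.

    Assumes one file per categorified column, named after the column.
    """
    present = set(list_category_files(categories_dir))
    return [c for c in columns if f"unique.{c}.parquet" not in present]
```

For example, `missing_category_files("~/data/categories", ["col_a", "col_b"])` (column names are placeholders) returns the columns whose unique-value files are absent, which shows whether only the columns categorified after the join are affected.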

Expected behavior

The unique values of the categorified column should be saved to a file, as they are for columns categorified before the JoinExternal operation.

Environment details

rnyak commented 8 months ago

@PaulSteffen-betclic can you please send us a toy repro example? Thanks.