Open allenfengjr opened 1 year ago
For my first question, I think it should be `day_0`, so there is no problem with that.
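A quick way to see why the missing suffix is harmless: a parser keys off the delimiter it is given, not the file name. The sketch below assumes the decompressed Criteo files are tab-separated with no header row. The helper `peek_rows` and the sample values are hypothetical, and it uses only Python's standard `csv` module rather than NVTabular itself.

```python
import csv
import os
import tempfile


def peek_rows(path, n=2, delimiter="\t"):
    """Return the first n rows of a delimited text file, ignoring its extension."""
    rows = []
    with open(path, newline="") as f:
        for i, row in enumerate(csv.reader(f, delimiter=delimiter)):
            if i >= n:
                break
            rows.append(row)
    return rows


# Hypothetical stand-in for a decompressed file named `day_0` (no .csv suffix).
tmpdir = tempfile.mkdtemp()
sample_path = os.path.join(tmpdir, "day_0")
with open(sample_path, "w", newline="") as f:
    # Made-up values: a label, two numeric fields (one empty), one hashed category.
    f.write("0\t1\t5\t68fd1e64\n1\t\t3\t05db9164\n")

rows = peek_rows(sample_path)
print(rows)
```

Running the same check on a real `day_0` file should show whether the columns split as expected before launching the full preprocessing job.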
Have you successfully preprocessed the Criteo 1TB dataset with this method? I would like to know how much RAM this process needs. During my trial, it always says "distributed.worker.memory - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker-memory.html#memory-not-released-back-to-the-os for more information. -- Unmanaged memory: 189.95 GiB -- Worker memory limit: 200.00 GiB"
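To put the two figures in that warning in context, they can be compared against the Dask worker memory thresholds. The sketch below parses the tail of the quoted message and computes the fraction of the limit in use. The threshold values are Dask's documented defaults for process memory (spill 0.70, pause 0.80, terminate 0.95) and are an assumption here, since the actual cluster configuration is not shown. The helper `memory_fraction` is hypothetical.

```python
import re

# Tail of the warning quoted above.
WARNING = "-- Unmanaged memory: 189.95 GiB -- Worker memory limit: 200.00 GiB"


def memory_fraction(message):
    """Return unmanaged memory as a fraction of the worker memory limit."""
    unmanaged = re.search(r"Unmanaged memory: ([\d.]+) GiB", message)
    limit = re.search(r"Worker memory limit: ([\d.]+) GiB", message)
    if unmanaged is None or limit is None:
        raise ValueError("message does not contain both memory figures")
    return float(unmanaged.group(1)) / float(limit.group(1))


# Assumed Dask defaults; a real cluster may override them in its configuration.
THRESHOLDS = {"spill": 0.70, "pause": 0.80, "terminate": 0.95}

fraction = memory_fraction(WARNING)
# Process memory is at least the unmanaged figure, so any threshold that the
# unmanaged fraction alone exceeds has certainly been crossed.
crossed = [name for name, level in THRESHOLDS.items() if fraction >= level]
print(f"{fraction:.2%} of the limit; thresholds crossed: {crossed}")
```

Under these assumed defaults the worker is already past the pause level and sits just under the terminate level, so the warning points to a worker close to being restarted rather than a cosmetic message.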
No, I gave up using NVTabular.
Hello, I am trying to preprocess the Criteo TB dataset using NVTabular. I ran `nvt_preproc.sh` but got errors. This is the error code.
I read the error code. As far as I understand, it is caused by a data type mismatch when reading the file. But I did download the original files from this website (original data). I'm hoping someone can tell me why this happens and how I should fix it.
I still have two more questions. The first is that after I downloaded these files and decompressed them with `gzip -dk $filename`, I got files named `day_0`, `day_1`, ... instead of `day_0.csv`, `day_1.csv`. Will this cause any problems? I think the only difference is the suffix, but I am not sure.
The second is that I seem to be running `NVTabular Dataset` in CPU mode. I have already installed `cupy`, `rmm`, and some other libraries. What should I do to run this program in CUDA mode?
Thank you for your help!
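As a first diagnostic for the CPU-mode question, it can help to confirm which GPU libraries the current Python environment can actually see. The sketch below only checks whether packages can be located, using the standard library. It does not prove that NVTabular will select its GPU path. The package list (`cudf`, `dask_cudf`, `cupy`, `rmm`) is an assumption about what a GPU-backed install typically includes, and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Assumed set of GPU-side packages; adjust to match the actual install.
GPU_PACKAGES = ["cudf", "dask_cudf", "cupy", "rmm"]


def find_missing(packages):
    """Return the packages that cannot be located in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(GPU_PACKAGES)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All listed GPU packages are importable.")
```

If something on the list is reported as missing, installing `cupy` and `rmm` alone may not be enough. The script should be run with the same interpreter that launches `nvt_preproc.sh`.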