Closed coolcoder001 closed 2 years ago
So building the entity data (before any model is loaded on the GPU) is a CPU-bound process. It is a heavily parallel process that builds a large matrix of entity token ids, etc., to be accessed during training. I usually build this on a machine with 100-150 GB of memory. If you use fewer entities, it will take up less memory. You can also try setting dataset_threads to 1, which will also reduce the memory pressure on the CPU.
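To make the parallelism/memory tradeoff concrete, here is a small illustrative sketch (this is not Bootleg's actual preprocessing code; the sizes, names, and `DATASET_THREADS` constant are made-up stand-ins for the real config knob). More worker processes hold more chunks of the token-id matrix in RAM at once, so dropping the worker count lowers peak memory at the cost of wall-clock time:

```python
# Illustrative sketch only (not the library's real code): the entity-prep step
# is a CPU-bound, multi-process job that materializes a large matrix of
# entity token ids in RAM. Fewer worker processes (the dataset_threads knob)
# and fewer entities both shrink the peak memory footprint.
from multiprocessing import Pool

import numpy as np

NUM_ENTITIES = 100_000      # assumption: toy size; real entity dumps are far larger
MAX_TOKENS = 128            # assumption: token ids kept per entity
DATASET_THREADS = 1         # analogue of the dataset_threads setting

def build_token_ids(entity_id: int) -> np.ndarray:
    """Stand-in for tokenizing one entity's text into a fixed-width id row."""
    rng = np.random.default_rng(entity_id)
    return rng.integers(0, 30_522, size=MAX_TOKENS, dtype=np.int32)

if __name__ == "__main__":
    # Each worker buffers its own chunk of results before they are gathered,
    # so more workers means more simultaneous copies and higher peak RAM.
    with Pool(processes=DATASET_THREADS) as pool:
        rows = pool.map(build_token_ids, range(NUM_ENTITIES), chunksize=1_000)
    entity_matrix = np.stack(rows)  # the large CPU-side matrix used in training
    print(entity_matrix.shape, entity_matrix.nbytes / 1e6, "MB")
```

None of this runs on the GPU, which is why the step fails on Colab's 51 GB of RAM regardless of the GPU being available.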
Hi @lorr1 ,
The entity_embedding_tutorial notebook is not using the GPU on Google Colab, even though a GPU is available.
I am using Google Colab Pro+, which has 51 GB of RAM.
The "Building entity data from scratch" step fails every time with an out-of-memory (RAM) error. It is supposed to use the GPU, right?
Here are the config parameters: