Open mpdprot opened 3 years ago
My best guess is that the radial basis functions are taking a while to initialize. That's a huge model, as in well over 1 TB of memory huge.
With 6 heads of dimension 48 and a hidden dimension of 256, the model uses ~48 GB with an input of size 300, and that's with max degree 2. You're looking at roughly 100x that with those settings, since higher-order types add memory overhead that grows with the square of the type's order.
I would recommend specifying much smaller hidden dimensions for degrees > 0, and no more than 256 for the type-0 hidden dimension.
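For a rough sense of the numbers in this reply, here is a minimal back-of-envelope sketch in Python. It only multiplies out the figures quoted above (~48 GB baseline, ~100x multiplier) and is not a measurement. The helper names are hypothetical, and `pairwise_block_size` rests on the assumption that a type-l feature has 2l + 1 components, which is one way to see why memory grows quadratically with the type's order.

```python
# Back-of-envelope figures taken from the reply above; nothing here is measured.
BASELINE_GB = 48     # ~48 GB at max degree 2, 6 heads of dim 48, hidden dim 256, input size 300
SCALE_FACTOR = 100   # the reply's rough "~100x" multiplier for the settings in question


def estimated_memory_gb(baseline_gb: float, scale: float) -> float:
    """Scale a known memory footprint by a rough multiplier (hypothetical helper)."""
    return baseline_gb * scale


def pairwise_block_size(degree_in: int, degree_out: int) -> int:
    """Entries in one block coupling a type-degree_in feature to a type-degree_out feature.

    Assumption: a type-l feature has 2l + 1 components, so the block has
    (2 * degree_in + 1) * (2 * degree_out + 1) entries.
    """
    return (2 * degree_in + 1) * (2 * degree_out + 1)


estimate = estimated_memory_gb(BASELINE_GB, SCALE_FACTOR)
print(f"Estimated footprint: ~{estimate / 1000:.1f} TB")

# Show how a same-degree block grows as the degree increases.
for degree in range(5):
    size = pairwise_block_size(degree, degree)
    print(f"type-{degree} -> type-{degree} block: {size} entries")
```

Comparing the estimate against the RAM reported by `lshw -short` before building the model is a cheap way to tell whether a given configuration has any chance of fitting.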
I am trying to run an example from the README. The code is:
The output hangs on 'Initialising model...' and eventually the kernel dies.
Any ideas why this would be happening?
Here is my `pip freeze`:
Here is a summary of my system info (`lshw -short`):