explosion / sense2vec

🦆 Contextually-keyed word vectors
https://explosion.ai/blog/sense2vec-reloaded
MIT License
1.62k stars 240 forks

Prodigy and version of sense2vec - process is constantly killed #127

Open kuatroka opened 3 years ago

kuatroka commented 3 years ago

Hi, when I follow this tutorial on how to combine Prodigy and the 2019 version of sense2vec, I constantly get the CLI message "Killed", with no further description of what to do to correct it. This only happens with the s2v_reddit_2019_lg/s2v_reddit_2019_lg version. The s2v_reddit_2015_md/s2v_old version works perfectly with the same parameters.

In the CLI I run prodigy sense2vec.teach ner-client-dataset ./assets/s2v_reddit_2019_lg/s2v_reddit_2019_lg --seeds "Walmart, Apple"

and I get "Killed".

When I use prodigy sense2vec.teach ner-client-dataset ./assets/s2v_reddit_2015_md/s2v_old --seeds "Walmart, Apple", everything works fine.

Thanks
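A bare "Killed" message is what a shell typically prints when a process is terminated by SIGKILL. The sketch below is one way to confirm that on a POSIX system by running the command from Python and inspecting its return code. The helper names `describe_exit` and `run_and_report` are hypothetical, and a SIGKILL alone does not prove that the kernel's out-of-memory killer was responsible; the kernel log is usually where that is recorded.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short description.

    On POSIX, subprocess reports a negative return code -N when the
    child was terminated by signal N.
    """
    if returncode < 0:
        sig = -returncode
        try:
            name = signal.Signals(sig).name
        except ValueError:
            name = f"signal {sig}"
        return f"terminated by {name}"
    return f"exited with status {returncode}"


def run_and_report(cmd):
    """Run cmd (a list of arguments) and describe how it ended."""
    proc = subprocess.run(cmd)
    return describe_exit(proc.returncode)
```

For example, `run_and_report(["prodigy", "sense2vec.teach", "ner-client-dataset", "./assets/s2v_reddit_2019_lg/s2v_reddit_2019_lg", "--seeds", "Walmart, Apple"])` would return "terminated by SIGKILL" if the process was killed that way.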

abishekvashok commented 3 years ago

Hey, it most likely gets killed due to memory issues. The 2015 edition is just a gig, while the 2019 version is 3.9 GB in size alone. So there's a lot more memory usage, and when resources are exhausted the system terminates the process.
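A rough way to check this before loading the vectors is to compare the model's on-disk size with the memory the system reports as available. The sketch below assumes Linux (it reads `MemAvailable` from `/proc/meminfo`). The function names and the `headroom` factor of 2 are illustrative assumptions, not part of sense2vec or Prodigy, and the on-disk size is only a lower bound on what loading actually needs.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def available_memory_bytes(meminfo_path="/proc/meminfo"):
    """Return MemAvailable in bytes, or None if it cannot be read."""
    try:
        with open(meminfo_path) as fh:
            for line in fh:
                if line.startswith("MemAvailable:"):
                    # The value in /proc/meminfo is given in kB.
                    return int(line.split()[1]) * 1024
    except OSError:
        return None
    return None


def likely_fits(model_bytes, avail_bytes, headroom=2.0):
    """Hypothetical rule of thumb: require headroom x the on-disk size.

    Returns None when available memory is unknown.
    """
    if avail_bytes is None:
        return None
    return model_bytes * headroom <= avail_bytes


if __name__ == "__main__":
    # Path taken from the command in the original post.
    model_dir = "./assets/s2v_reddit_2019_lg/s2v_reddit_2019_lg"
    size = dir_size_bytes(model_dir)
    avail = available_memory_bytes()
    print(f"on-disk size: {size / 1e9:.2f} GB")
    if avail is not None:
        print(f"available memory: {avail / 1e9:.2f} GB")
    print("likely fits:", likely_fits(size, avail))
```

If `likely_fits` returns False, the process is a plausible candidate for being terminated once memory runs out.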

myeghaneh commented 2 years ago

I have the same problem! I have trained my own S2V, but as soon as I run it, it kills the kernel.

corradofiore commented 1 year ago

This is essentially a RAM-related issue. You need lots of RAM. We were having the same problem and we tackled it using a dedicated server from Hetzner. They have some 512 GB RAM boxes in their "Server Auction" section which are pretty cost-effective.