dice-group / WHALE


Implement memory map approach using `mmappickle` #4

Closed sshivam95 closed 2 weeks ago

sshivam95 commented 2 weeks ago

Dice-embedding does not support reading from memory-mapped files. It reads the whole file directly into main memory, which causes out-of-memory failures when the knowledge base file is larger than main memory. Here we use the `mmappickle` library, which provides a memory-mapped pickle file format, to create the indices of relations and entities.

This lets us transform the training set into a numpy.ndarray of indexed training data.
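A minimal sketch of the idea, not the code from this issue: to stay dependency-free it uses the standard library's `mmap` and `struct` modules as a stand-in for `mmappickle` and numpy. The function names `index_triples` and `read_triple`, the file names, and the whitespace-separated triple format are all assumptions made for illustration. Note that in this sketch the entity and relation dictionaries still live in main memory; only the indexed triples are read back through the memory map.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk layout: three little-endian int64 ids per triple.
TRIPLE = struct.Struct("<3q")


def index_triples(kb_path, out_path):
    """Stream a triple file line by line and write (head, relation, tail) ids."""
    entity_to_idx, relation_to_idx = {}, {}
    n = 0
    with open(kb_path, "r", encoding="utf-8") as src, open(out_path, "wb") as dst:
        for line in src:
            parts = line.split()
            if len(parts) < 3:
                continue  # skip blank or malformed lines
            h, r, t = parts[0], parts[1], parts[2]
            # Assign the next free index the first time a name is seen.
            h_id = entity_to_idx.setdefault(h, len(entity_to_idx))
            r_id = relation_to_idx.setdefault(r, len(relation_to_idx))
            t_id = entity_to_idx.setdefault(t, len(entity_to_idx))
            dst.write(TRIPLE.pack(h_id, r_id, t_id))
            n += 1
    return entity_to_idx, relation_to_idx, n


def read_triple(mm, i):
    """Decode the i-th indexed triple from the memory-mapped buffer."""
    return TRIPLE.unpack_from(mm, i * TRIPLE.size)


with tempfile.TemporaryDirectory() as tmp:
    kb = os.path.join(tmp, "kb.txt")
    idx = os.path.join(tmp, "train.bin")
    with open(kb, "w", encoding="utf-8") as f:
        f.write("a knows b\nb knows c\nc likes a\n")

    ents, rels, n = index_triples(kb, idx)

    # Map the indexed file read-only; pages are loaded on access by the OS.
    with open(idx, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        rows = [read_triple(mm, i) for i in range(n)]

print(rows)
```

With numpy available, the same binary file could instead be opened as an array via `numpy.memmap` with an int64 dtype and reshaped to `(n, 3)`, which is closer to the numpy.ndarray of indexed train data described above.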

sshivam95 commented 2 weeks ago

Fix

sshivam95 commented 2 weeks ago

Test on the extracted files --> Running on extracted 10M Noctua 1 - (3066765)