NationalSecurityAgency / lemongraph

Log-based transactional graph engine

bulk import #18

Open tamerh opened 6 years ago

tamerh commented 6 years ago

Hello,

I have a large, sorted dataset of numeric keys and values, around 15 billion records. I am able to insert it into an LMDB database in a reasonable time.

My question: is there a way to bulk-insert this data into LemonGraph, and can it handle this volume?

Thanks

NSA-LGDev2 commented 6 years ago

I am not sure - supporting graphs of that size was outside the scope of this project.

However, as a stress test, I once loaded a graph with 10 million nodes (empty type, sequentially numbered values) and then added 100 million random edges in batches of 1 million per transaction. Load time was around 4 hours, the resulting database was 30 GB, and the edge insert rate at the end was around 3.7k per second.
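
As a rough back-of-the-envelope sketch (not a measurement), the stress-test figures above can be scaled to the 15 billion records in the question. Linear scaling is an assumption, and probably an optimistic one, since the insert rate had already dropped to 3.7k per second by the end of the test. The variable names are illustrative, and treating each record as one node or edge is also an assumption.

```python
# Figures quoted from the stress test.
nodes_loaded = 10_000_000
edges_loaded = 100_000_000
load_hours = 4
db_size_gb = 30
final_edge_rate = 3_700  # edges per second at the end of the load

# Size of the dataset in the question.
target_records = 15_000_000_000

items_loaded = nodes_loaded + edges_loaded

# Average insert rate over the whole 4-hour load (nodes and edges together).
avg_rate = items_loaded / (load_hours * 3600)

# Time to load the target if the final observed rate held steady (assumption).
days_at_final_rate = target_records / final_edge_rate / 86_400

# Disk usage if bytes per item stayed constant (assumption).
bytes_per_item = db_size_gb * 1e9 / items_loaded
projected_tb = bytes_per_item * target_records / 1e12

print(f"average rate during test: {avg_rate:,.0f} items/s")
print(f"projected load time:      {days_at_final_rate:.0f} days")
print(f"projected database size:  {projected_tb:.1f} TB")
```

Under these assumptions the load would take on the order of weeks and the database would reach several terabytes, which is consistent with the caution that graphs of this size were outside the project's scope.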

You would definitely need to break the load into multiple transactions so that LemonGraph can automatically grow LMDB's map size. Depending on your hardware, it might also be a good idea to set noreadahead=True when you open the graph.
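
A minimal sketch of the "multiple transactions" advice, assuming nothing about LemonGraph itself: `batched` and `load_in_batches` are hypothetical helpers, and `begin_write` and `insert` are caller-supplied hooks, so the batching logic can be tried against any store.

```python
from itertools import islice


def batched(records, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load_in_batches(records, batch_size, begin_write, insert):
    """Insert records using one write transaction per batch.

    begin_write: zero-argument callable returning a context manager that
                 yields a transaction and commits on clean exit (hypothetical hook).
    insert:      callable(txn, record) that writes one record (hypothetical hook).
    Returns the number of transactions committed.
    """
    commits = 0
    for batch in batched(records, batch_size):
        # Each batch gets its own transaction, so the store has a chance
        # to grow between commits instead of holding one huge transaction.
        with begin_write() as txn:
            for record in batch:
                insert(txn, record)
        commits += 1
    return commits


# Possible wiring for LemonGraph (untested assumption; check the names and
# the noreadahead keyword against the project README before relying on it):
#
#   import LemonGraph
#   g = LemonGraph.Graph(path, noreadahead=True)
#   load_in_batches(
#       pairs, 1_000_000,
#       lambda: g.transaction(write=True),
#       lambda txn, kv: txn.node(type='', value=str(kv[0])),
#   )
```

Because `records` is consumed lazily, the input can be a generator reading from a file or an LMDB cursor, so only one batch is held in memory at a time. The batch size of 1 million in the commented example mirrors the stress test and is a starting point, not a tuned value.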

Does that help?