Hi Tomer, this is definitely something we are going to implement, or we will allow users to implement their own serializers, for instance one that inserts data directly into the database.
Regards,
Arnau
On 18/11/2014 12:47, "Tomer Sagi" notifications@github.com wrote:
Hi, I want to generate SF 8000 in the near future. It seems strange to me to run the Hadoop job, get 8 TB of data, move it into another 8 TB of CSV files, and then load those into more than 8 TB of DB storage. That means that to run an 8 TB workload I need at least 16 TB, and probably more. Are there any plans for an interactive version where the data is loaded into the DB as it is generated? Thanks, Tomer
Any timeline on that? I will need it pretty soon...
We don't know. We are currently working to ensure that datagen is able to generate graphs with more than 500 billion edges. Once this is done, I'll work on your request, but I expect it to take about one month. If this is very important for you, you can always download the code and modify it yourself. You would basically implement the Serializer interface.
Regards,
Arnau
OK, please keep me updated on this. Thanks.
This has been resolved with the introduction of custom serializers.