Closed. wileeam closed this issue 10 years ago.
Hi again Guillermo,
I'll look into this issue asap.
Arnau.
On Thu, May 22, 2014 at 2:13 PM, Guillermo notifications@github.com wrote:
Hi!
We are having trouble generating a dataset with 'small' values for the parameters. See below (Hadoop configured for a single-node run, one thread set in the script run.sh, and using the latest commit as of this posting, da42eb5):
numPersons:1000
startYear:2014
numYears:10
serializerType:csv
enableCompression:false

Same problem happens if choosing 10000. And basically we don't get any data in the different files generated. We get this error a few times...
14/05/22 14:08:57 INFO mapred.JobClient: Task Id : attempt_201405211824_0061_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 91
    at ldbc.socialnet.dbgen.generator.ScalableGenerator.generatePosts(ScalableGenerator.java:1026)
    at ldbc.socialnet.dbgen.generator.ScalableGenerator.generateUserActivity(ScalableGenerator.java:816)
    at ldbc.socialnet.dbgen.generator.MRGenerateUsers$UserActivityReducer.reduce(MRGenerateUsers.java:277)
    at ldbc.socialnet.dbgen.generator.MRGenerateUsers$UserActivityReducer.reduce(MRGenerateUsers.java:247)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:177)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

Is this a limitation of the generator, or are we doing something wrong? We have no trouble generating a dataset when using the settings for scale factor #3, for example.
Thanks!
Hey Guillermo,
The issue should be fixed. Thanks Andrey for the fix.
Regards,
Arnau
Hello,
An update on this: we just successfully generated a dataset with 1K people for a period of 10 years without errors. I'll close this issue for now :)
Thanks a lot!
/Guillermo
Hi!
We are having trouble generating a dataset with 'small' values for the parameters. See below (Hadoop configured for a single-node run, one thread set in the script run.sh, and using the latest commit as of this posting, da42eb54a215de86474d142346864057dc6a5624):
The same problem happens when choosing 10000 persons. Basically, we don't get any data in the different files generated. We get this error a few times...
Is this a limitation of the generator, or are we doing something wrong? We have no trouble generating a dataset when using the settings for scale factor 3, for example.
Thanks!