ldbc-dev / ldbc_snb_datagen_deprecated2015

LDBC-SNB Data Generator
GNU General Public License v3.0

Python error in paramgenerator scripts? #6

Closed wileeam closed 10 years ago

wileeam commented 10 years ago

Hello,

The new parameter-generation code in run.sh raises an error (see below) at the end of the dataset generation process. The dataset itself is generated successfully, though.

14/06/02 10:19:29 INFO mapred.JobClient:     Map output records=1000
309 total seconds
Warning: $HADOOP_HOME is deprecated.

Warning: $HADOOP_HOME is deprecated.

loading input for parameter generation
find parameter bindings for Persons
find parameter bindings for Countries
find parameter bindings for Tags
find parameter bindings for Timestamps
Traceback (most recent call last):
  File "paramgenerator/generateparams.py", line 216, in <module>
    sys.exit(main())
  File "paramgenerator/generateparams.py", line 164, in main
    selectedTimeParams = findTimeParams(timeSelectionInput, argv[1], argv[2], ts[1])
  File "/NOBACKUP/ldbc_socialnet_bm/ldbc_socialnet_dbgen/paramgenerator/timeparameters.py", line 201, in findTimeParams
    output[queryId] = findTimeParameters(input[queryId][0], mycFactors, input[queryId][1], input[queryId][2])
  File "/NOBACKUP/ldbc_socialnet_bm/ldbc_socialnet_dbgen/paramgenerator/timeparameters.py", line 33, in findTimeParameters
    timeParams = timestampSelection(factors,medians)
  File "/NOBACKUP/ldbc_socialnet_bm/ldbc_socialnet_dbgen/paramgenerator/timeparameters.py", line 66, in getTimeParamsWithMedian
    if currentMedian.count > median:
AttributeError: 'int' object has no attribute 'count'
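
For readers hitting the same traceback, here is a minimal sketch of how this kind of error can arise. It is not the code from timeparameters.py: the `Bucket` class and the `exceeds_median` and `demo` helpers are hypothetical, and only the attribute name `count` is taken from the traceback. The assumption is that a value expected to be an object carrying a `count` attribute is sometimes a plain integer.

```python
# Hypothetical reconstruction: none of these names come from timeparameters.py,
# except the attribute name `count`, which appears in the traceback.
class Bucket:
    """Stand-in for an object that carries a `count` attribute."""

    def __init__(self, count):
        self.count = count


def exceeds_median(current, median):
    # Mirrors the shape of the failing comparison `currentMedian.count > median`.
    return current.count > median


def demo():
    """Run the comparison once with an object and once with a bare int."""
    results = []
    for current in (Bucket(5), 5):  # the second item is a plain int
        try:
            results.append(exceeds_median(current, 3))
        except AttributeError as err:
            results.append(str(err))
    return results
```

Calling `demo()` succeeds for the `Bucket` instance and captures the AttributeError message for the bare integer, which is the same failure mode reported above.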

agubichev commented 10 years ago

Hello, can you tell me which scaleFactor you are using?

wileeam commented 10 years ago

Hi!

our own... see below:

    <scale_factor number="2" >
        <num_persons>1000</num_persons>
        <start_year>2014</start_year>
        <num_years>10</num_years>
    </scale_factor>

I haven't tried setting these parameters directly, as was possible before. Awaiting further information just in case.

agubichev commented 10 years ago

I pushed the fix, can you try it now?

ArnauPrat commented 10 years ago

By the way, I have improved the way parameters are passed to the DBGEN. Here are the new instructions:

https://github.com/ldbc/ldbc_socialnet_bm/wiki/Compilation_Execution

Arnau

wileeam commented 10 years ago

Now... float division by zero for the same parameters as before...

14/06/02 16:11:14 INFO mapred.JobClient:     Map output records=1000
309 total seconds
14/06/02 16:11:14 INFO util.NativeCodeLoader: Loaded the native-hadoop library
loading input for parameter generation
find parameter bindings for Persons
find parameter bindings for Countries
find parameter bindings for Tags
find parameter bindings for Timestamps
Traceback (most recent call last):
  File "paramgenerator/generateparams.py", line 216, in <module>
    sys.exit(main())
  File "paramgenerator/generateparams.py", line 164, in main
    selectedTimeParams = findTimeParams(timeSelectionInput, argv[1], argv[2], ts[1])
  File "/NOBACKUP/ldbc_socialnet_bm/ldbc_socialnet_dbgen/paramgenerator/timeparameters.py", line 205, in findTimeParams
    output[queryId] = findTimeParameters(input[queryId][0], mycFactors, input[queryId][1], input[queryId][2])
  File "/NOBACKUP/ldbc_socialnet_bm/ldbc_socialnet_dbgen/paramgenerator/timeparameters.py", line 31, in findTimeParameters
    timeParams = timestampSelection(factors,medians)
  File "/NOBACKUP/ldbc_socialnet_bm/ldbc_socialnet_dbgen/paramgenerator/timeparameters.py", line 74, in getTimeParamsWithMedian
    duration = int(28*median/currentMedian.count)
ZeroDivisionError: float division by zero
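
The failing expression in the traceback divides by `currentMedian.count`, which was zero in this run. Below is a minimal sketch of one way to guard such a division. It is not the fix that was pushed to the repository: the function name `safe_duration` and its `default` parameter are hypothetical, and only the formula `int(28*median/count)` is taken from the traceback.

```python
def safe_duration(median, count, default=0):
    """Return int(28 * median / count), or `default` when count is zero.

    Hypothetical helper: only the formula comes from the traceback; the
    fallback value is an assumption, not the project's actual behaviour.
    """
    if count == 0:
        # Avoid ZeroDivisionError when the count is zero.
        return default
    return int(28 * median / count)
```

With this guard, `safe_duration(2.0, 0)` returns the fallback instead of raising, while `safe_duration(2.0, 4)` evaluates the original formula unchanged. Whether a fallback of 0 is meaningful for the parameter selection is a separate design question that the thread does not settle.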

ArnauPrat commented 10 years ago

Hi Guillermo,

Thank you for reporting these issues. In any case, these errors do not affect the generation of the social network: they come from a post-processing step that generates substitution parameters for the LDBC-SNB. Therefore, if you are only interested in the generated social network, the output files should still be correct.

Regards,

Arnau.

agubichev commented 10 years ago

Should be fixed for now, but I assume you do not use the parameters anyway. Their generation can be disabled by setting PARAM_GENERATION to 0 in run.sh.

wileeam commented 10 years ago

Correct! No errors now! Thanks @ArnauPrat and @agubichev. I was aware that the dataset was generated without errors, but thought I should let you know about this.

Closing this issue :) Now the inconsistency test comes...